Friday, January 28, 2011

How to Be Wrong, With Statistics!

Please, just stop it. You're hurting me.

Anyone who understands statistics at all cannot dispute that Kobe Bryant does not perform well statistically in the clutch. But anyone who understands statistics well cannot dispute that the current statistics are woefully under-equipped to discern who is the clutchiest player in the league.

Look: Nothing happens in a vacuum. We look at crunch-time statistics because crunch time is the most exciting part of the game, when it happens. But it's only one way to condition a play.

What do I mean by condition? I mean "to restrict the characteristics of." With respect to comparing players on their clutchiosity, the objective should be to condition the crunch-time plays sufficiently that we are comparing apples to apples, and oranges to oranges. And here, as with many other aspects of basketball, we simply don't have the statistics at our disposal to do it.
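To make "conditioning" concrete, here's a minimal sketch that treats it as a filter over play-by-play rows. The rows, the players, and the particular clutch definition (under 24 seconds, tied or down by no more than three) are all invented for illustration; real play-by-play data is messier.

```python
# Invented play-by-play rows: (player, seconds_left, score_margin, points_scored).
plays = [
    ("A", 18, -2, 2), ("A", 140, 5, 0), ("A", 10, 0, 3),
    ("B", 22, -3, 0), ("B", 300, -1, 2), ("B", 5, -1, 2),
]

def is_clutch(row):
    """Under 24 seconds on the clock, team tied or down by no more than three."""
    _, seconds_left, margin, _ = row
    return seconds_left < 24 and -3 <= margin <= 0

def clutch_efficiency(player, rows):
    """Points per possession, conditioned on the clutch filter above."""
    mine = [r for r in rows if r[0] == player and is_clutch(r)]
    return sum(r[3] for r in mine) / len(mine)

print(clutch_efficiency("A", plays))  # 2.5: (2 + 3) over 2 clutch possessions
print(clutch_efficiency("B", plays))  # 1.0: (0 + 2) over 2 clutch possessions
```

The point of the sketch is that every conditioning choice is just another predicate like `is_clutch`, and the ones we'd actually need (teammate quality, defensive pressure) aren't in the box score.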

For instance, suppose that we wish to compare two players, A and B. Suppose that A's offensive efficiency (points per possession) is greater than B's, with less than 24 seconds on the clock and the team tied or down no more than three points. Does that mean that A is clutchier than B?

Not at all. If B has stiffs for teammates, compared to A, then he's likely going to be faced with tighter individual defense than A, and likely earn a lower offensive efficiency than A. That's a couple of instances of "likely" in there, but the point doesn't have to be ironclad, it just has to be plausible, even probable. We just don't know enough to conclude with anything approaching certainty that A is clutchier, because we haven't conditioned on the teammates. (Or the defense, for that matter.)
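In case the teammate effect sounds hand-wavy, here's a toy illustration (all numbers invented) of how it can even reverse the comparison, Simpson's-paradox style: B posts the better efficiency under both loose and tight defense, yet looks worse unconditioned, simply because his stiff teammates leave him facing tight defense most of the time.

```python
# Invented crunch-time numbers: counts[player][defense] = (possessions, points).
counts = {
    "A": {"loose": (80, 88), "tight": (20, 17)},  # good teammates: mostly loose coverage
    "B": {"loose": (20, 23), "tight": (80, 72)},  # stiffs for teammates: mostly tight
}

def efficiency(player, defense=None):
    """Points per possession, optionally conditioned on defensive pressure."""
    keys = [defense] if defense else list(counts[player])
    poss = sum(counts[player][k][0] for k in keys)
    pts = sum(counts[player][k][1] for k in keys)
    return pts / poss

for p in ("A", "B"):
    print(p,
          "overall:", efficiency(p),         # A: 1.05, B: 0.95
          "loose:", efficiency(p, "loose"),  # A: 1.10, B: 1.15
          "tight:", efficiency(p, "tight"))  # A: 0.85, B: 0.90
# Unconditioned, A looks clutchier; conditioned on defense, B wins both ways.
```

Nothing about the arithmetic is exotic; the trouble is that the box score hands us the "overall" column and hides the conditioning variable entirely.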

Observe that this is mostly independent of what statistic you use to measure clutchiness. Suppose, instead, that you decide to use win probability increment. A player's ability to increase his team's likelihood of winning is still going to be affected by his teammates: If he passes, they will have a lower probability of scoring; if he doesn't, the defense can afford to defend him more tightly.

Of course, maybe you're OK with a quality like this vacillating with things such as which teammates a player has. But personally, I think such a measure has a certain ephemeral aspect that we don't usually associate with clutchiness.

The problem is, how can you possibly condition on the kind of teammates that a player has? Players don't change teammates the way they change their clothes (or at least they shouldn't). So what do you do?

Here's my gentle suggestion: Stop trying to answer these abstract questions statistically. I've been using outlandish forms of the word "clutch" to underscore this, in case you haven't noticed, but my point is serious. Use statistics to answer the questions they can. As the field advances, we'll be able to answer more of these questions, but in the meantime, use the same method we've been using all along: subjective observation. Western civilization didn't break down before we had PER. Nothing hinges on who people outside the game think is clutch. And mostly, stop pretending to any degree of certainty in the matter, just because a number is attached to it.

EDIT: Since I'm a fan of Kobe Bryant, one might reasonably wonder whether I've got a built-in bias against crunch-time statistics, since almost all of them (except perhaps a raw count of shots made in crunch time, as opposed to efficiency) point to quite a few players as being superior to him in the clutch. Obviously, I can't deny said bias. Quite possibly I would not be making these same arguments, or making them with quite the same degree of vehemence, if those statistics showed Bryant in a better light.

That being said, however, I don't think the question of using statistics to examine clutchitude should be predicated on how well they accord with conventional wisdom (where Bryant is, indeed, king of clutch). In my opinion, there are quite compelling fundamental arguments that straightforward linear metrics such as PER or offensive efficiency or wins produced, conditioned on crunch time or not, are simply not reliable indicators of individual performance, and those arguments would remain valid regardless of whether I espoused them, or of whom they revealed to be the top performers, in crunch time or in the game overall.

Wednesday, January 5, 2011

Voter Mixing Equals Criterion Mixing

I'm going to talk about basketball and probability again. Wasn't that obvious from the title of this post?

It's apparently never too early to talk about the MVP award for the NBA. We're coming up on the halfway point of the season, and writers have been tracking the MVP candidates for, oh, about half a season. Nobody takes them seriously until about now, though.

One side effect of the question being taken seriously is that some wag will point out that the MVP is not—and has never been—defined precisely. In fact, I can't find anywhere where it's been defined at all by the NBA, precisely or otherwise. That leaves the voters (sportswriters and broadcasters, mostly, plus a single vote from NBA fans collectively) to make up their own definition, a situation that said wag invariably finds ludicrous.

Well, here's one wag that finds this situation perfectly acceptable. Desirable, even.

Listen: There is no way that everybody will ever agree on a single criterion for being the "most valuable player." Most valuable to whom? The team? The league? The fans? Himself? (I can think of a few players who certainly aim to be most valuable to themselves.) And what kind of value? Wins? Titles? Highlights? Basketball is entertainment, after all. There are just too many different ways to evaluate players.

Instead, we might imagine that some writers would get together at some point and define MVP as a mixture of criteria. For instance, the title of MVP could be based in equal parts (or unequal parts, for that matter) on individual output, contributions to team success, and entertainment value.

Except, I'd argue that that is exactly what we've been doing for all these years. We have all these voters, all of whom have differing ideas of what the MVP does (or should) stand for. Some people think it should be based on individual statistics (Hollinger's Player Efficiency Rating, or PER, is a current favorite). Some people think it should be based, at least in part, on team success, so team wins are an input to the decision (a 50-win minimum is a popular threshold). Still others dispense with explicit criteria altogether and vote based on reputation or flash.

Well, if exactly the same number of voters take each of those different perspectives on MVP, then we will have an MVP based in equal parts on individual output, contributions to team success, and entertainment value. And if more voters lean on individual output than on entertainment value, then the MVP make-up will show that same leaning. Voter mixing equals criterion mixing!
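Here's a score-level sketch of that equivalence (candidates, criterion scores, and bloc sizes are all invented, and real MVP balloting is ordinal, so treat this as the idealized version): tallying an electorate where each voter scores by a single pet criterion gives exactly the same totals as one electorate scoring by the bloc-weighted mixed criterion.

```python
# Invented MVP field: three candidates scored 0-1 on three criteria.
criteria = {
    "individual": {"X": 0.9, "Y": 0.7, "Z": 0.5},
    "team_wins":  {"X": 0.4, "Y": 0.8, "Z": 0.6},
    "flash":      {"X": 0.6, "Y": 0.3, "Z": 0.9},
}
blocs = {"individual": 50, "team_wins": 30, "flash": 20}  # voters per criterion

candidates = ("X", "Y", "Z")
total_voters = sum(blocs.values())

# Tally 1: every voter scores each candidate by their single pet criterion.
tally = {c: sum(n * criteria[k][c] for k, n in blocs.items()) for c in candidates}

# Tally 2: one mixed criterion, weighted by bloc size, applied by every voter.
mixed = {c: total_voters * sum((n / total_voters) * criteria[k][c]
                               for k, n in blocs.items())
         for c in candidates}

assert all(abs(tally[c] - mixed[c]) < 1e-9 for c in candidates)
print(sorted(tally, key=tally.get, reverse=True))  # ['X', 'Y', 'Z'] either way
```

The weights in the mixed criterion are just the bloc proportions, which is the whole claim: shift voters between blocs and you shift the implicit weights, with no committee required.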

What's more, this criterion mixing is automatic. No committee needs to be formed, and the exact mixture evolves as the voter population evolves. If someday team success becomes more important to the basketball cognoscenti, then it'll automatically have a larger impact on MVP selection. No redefinition is necessary.

Can this equivalence be demonstrated on any kind of formal level? In something as complex as basketball, my guess is not. But it's close enough, and intuitive enough, that I think it just doesn't make sense to gripe about the MVP lacking a precise definition. As long as each voter comes to their own decision about what it stands for, we'll get the mix that we should.