Friday, January 27, 2012
Shot Selection and the Secretary Problem
Unfortunately, that's not the way things work in this problem. You only get the candidates one at a time, and you have to decide then and there whether or not to hire them. Once you've rejected a candidate, they're lost to you forever. One could, theoretically, lose the best candidate on the very first interview.
The question then is, what is your best strategy, and what is your probability of making the best possible hire using that strategy?
—well, let's just say an unsuccessful suitor and his head are soon parted. But it wasn't entirely unproductive; I eventually formulated a variation called the iterated sultan's dowry, in which a second suitor, seeing the first suitor's unsuccessful head roll down the hill, gets to use the information in choosing a prospective mate, and then the third, the fourth, etc. This variation has an interesting solution which is unfortunately too large to fit into this parenthetical comment.)
It can be shown, fairly easily, that the best strategy must be of the form "Skip the first n candidates, recording their suitability ratings. Then choose the next candidate whose rating exceeds all of theirs." The reason is that as you plow through the candidates, the probability that the best one is yet to come never increases, whereas the probability that you've already encountered the best one never decreases. So the question reduces to figuring out what the right choice for n is.
Ultimately, following a strategy like this, you could end up choosing no candidate at all if the best candidate is already in the first n, since you've already skipped all of those. But if you do pick a candidate, it will be number k > n.
For candidate k to be the best overall, the best candidate must be among the last 100-n. Furthermore, the best of the first k-1 (that is, the best candidate seen before the ultimate choice) must be in the first n, since otherwise you would have stopped earlier. Now, let's work out the probability that both of these happen. In order to do this, we have to break down the possibilities into all the different cases.
The first case is that the best candidate is the very next one—candidate number n+1—which happens with probability 1/100. You'll choose that one provided that there's no other candidate between the first n and number n+1 that is better than the first n. Since there are no candidates in between, that probability is 1. So the incremental probability for this case is 1/100 times 1, or just 1/100.
The second case is that the best candidate is the one after that—candidate number n+2—which again happens with probability 1/100. You'll choose that one provided that there's no candidate between the first n and number n+2 that is better than the best of the first n. That will be true provided the best of the first n+1 falls within the first n, so the incremental probability for this case is 1/100 times n/(n+1).
Following the same line of reasoning, the third case—that the best candidate is number n+3—provides an incremental probability of 1/100 times n/(n+2), the fourth case provides an incremental probability of 1/100 times n/(n+3), etc., until the last case—that the best candidate is number 100—provides an incremental probability of 1/100 times n/99.
Putting all these cases together, this strategy "wins" with probability
n/100 × [1/n + 1/(n+1) + 1/(n+2) + · · · + 1/99]
It can be shown, using relatively straightforward calculus, that this expression reaches a maximum when n = 37, and yields a probability of success of about 0.37.
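That claim is easy to check numerically. Here's a small sketch (my own verification, not part of the original argument) that evaluates the expression above for every possible cutoff n and reports the best one:

```python
def win_probability(n, total=100):
    """Probability of hiring the best candidate overall when you skip
    the first n and then take the first candidate who beats them all.

    This is the formula derived above:
        (n/total) * [1/n + 1/(n+1) + ... + 1/(total-1)]
    """
    if n == 0:
        return 1.0 / total  # no baseline: you hire the first candidate blindly
    return (n / total) * sum(1.0 / k for k in range(n, total))

best_n = max(range(100), key=win_probability)
print(best_n, round(win_probability(best_n), 4))  # 37 0.371
```

Running this confirms that n = 37 maximizes the expression, with a success probability just over 0.371.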
That's not a coincidence, incidentally. For a large candidate pool of size N (and a hundred candidates qualifies as a large pool), the best strategy is to skip the first n = N/e, where e = 2.71828+ is the base of the natural logarithm, and to take the earliest best candidate thereafter. The approximate probability of success (that is, choosing the very best candidate of them all) is very close to 1/e = 0.36787+.
For a lot of people (including myself), that's rather stunning. It implies that even if you have a million candidates, you have a strategy that picks the very best one of them with better than a one-in-three chance.
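If that seems too good to be true, a quick Monte Carlo check makes it concrete. The sketch below is my own (using an arbitrary pool size of 1,000 rather than a million, just to keep the runtime short); it simulates the skip-the-first-N/e strategy and counts how often it lands the single best candidate:

```python
import math
import random

def run_trial(pool_size, skip):
    """One run of the strategy: skip the first `skip` candidates, then
    take the first one who beats everything seen so far. Returns True
    if we picked the single best candidate in the whole pool."""
    ratings = [random.random() for _ in range(pool_size)]
    best_overall = max(ratings)
    threshold = max(ratings[:skip])
    for r in ratings[skip:]:
        if r > threshold:
            return r == best_overall
    return False  # the best was in the skipped prefix; we hired no one

random.seed(1)  # arbitrary seed, for reproducibility
pool_size = 1_000
skip = round(pool_size / math.e)  # skip roughly the first 368
trials = 5_000
wins = sum(run_trial(pool_size, skip) for _ in range(trials))
print(wins / trials)  # hovers near 1/e = 0.368 or so
```

Even with a thousand candidates, the empirical win rate sits right around 1/e, better than one in three.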
The reason I'm putting basketball in the mix is that there's a fairly straightforward application to a vital aspect of scoring: shot selection.
Consider: A possession in basketball lasts for anywhere from 0 to 24 seconds (neglecting offensive rebounds). You can't always guarantee that you'll make the shot; the next best thing (at least before the endgame) is to select the very best shot—that is, the shot that has the best probability of going in (neglecting fouls and three-point shots).
In other words, ahem, optimal shot selection.
But you don't always know when that best shot is going to come, especially when you're working out of a halfcourt set. Is it the very first one? Is it the next best one? Maybe the best one will come at least twenty seconds into the possession. You just don't know. But maybe, now, you have a rule of thumb for selecting that best shot. You skip the ones that come in the first 24/e = 9 seconds (approximately), and take the next best one that comes thereafter.
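For what it's worth, the cutoff itself is a one-line computation (my own arithmetic, just making the 24/e figure above explicit):

```python
import math

SHOT_CLOCK = 24  # seconds, neglecting offensive rebounds as above
cutoff = SHOT_CLOCK / math.e
print(f"skip shots in the first {cutoff:.1f} seconds")  # about 8.8 seconds
```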
Obviously, this rule makes lots of assumptions, such as (a) the defense is equally tenacious across the entire possession, (b) the offense is equally productive of shot opportunities across the entire possession, (c) the best shot opportunity is equally likely to come at any time during the entire possession, etc. But to the limited extent that these assumptions are approximately valid, it's not a bad rule of thumb. It suggests that the Phoenix Suns of the early-to-mid-2000s were a bit hasty.
But not by much. Just a second or two.