Another meandering post. You've been warned.
I'm re-reading Isaac Asimov's informal autobiography, I. Asimov (a play on his collection of robot stories, entitled I, Robot, and to be distinguished from his formal autobiographies published earlier in his life), and finding it quite entertaining. Partly, this is because I'm an inveterate re-reader and re-watcher. My enjoyment of a piece of writing or a movie or a TV program doesn't diminish because I know how it goes. If I enjoyed it the first time, I'll enjoy it just as much the seventh time, or the fifty-seventh. Even a sporting event isn't diminished because I know how the final score (although I do prefer to watch it live the first time, if I can). All this just by the way.
Anyway, in this book, Asimov mentions his facility at giving impromptu talks, and mentions by way of illustration that he has given a couple of thousand talks, no two exactly alike.
And that phrase, "no two exactly alike," is so characteristic of snowflakes that I immediately thought of them. In fact, I'd go so far as to wager that if you asked people what the first thing was that they thought about snowflakes, it would be that no two are alike.
But is that actually so? Have there really never been two snowflakes alike? If you're like most people, you'd probably just as soon leave well enough alone and assume it's true. For the heck of it, though, take a trip with me down the rabbit hole.
The whole idea that no two snowflakes are exactly alike has been around for time immemorial, but things really got moving with a man named Wilson Bentley (1865-1931), who grew up in Vermont. When he was fifteen, his mother gave him an old microscope to experiment with. Well, Vermont winters being what they were, I suppose it's natural that Wilson should have been drawn to snowflakes. And so he took to maneuvering snowflakes under his microscope and sketching them.
It turned out, however, that they melted quickly—far too quickly for him to sketch in time. So Bentley assembled a contrivance, a camera attached to a microscope attached to a board covered with black velvet, which permitted him to take pictures of the snowflakes before they melted. Over his lifetime, he took images of over five thousand snowflakes, and sure enough, no two of them were exactly alike.
Five thousand, though a lot to take pictures of, is still a minuscule fraction of all the snowflakes that ever were, or even of those that are currently in existence (a constantly changing population, to be sure). Surely there is no way that we could possibly take pictures of all the ones that currently exist, let alone those that have ever existed. Is there perhaps another way to answering the question?
Consider: Each year, a substantial portion of the Earth is hit by snowstorms sufficient to dump several meters of snow on the ground. I'm not sure of my statistics, but we probably wouldn't be far off if we assumed that the total annual snowfall amounted to a depth of, let's say, two tenths of a meter over the entire surface of the Earth, if it was spread around evenly. Since the surface area of the Earth is about 5×10^14 square meters, we're talking about 10^14 cubic meters of snow. When packed tightly (tightly enough to crush them), snowflakes might occupy a cube about a tenth of a millimeter on a side. So each year, we get something like 10^26 snowflakes. Taking into account the fact that there has been snowfall for a few billion years, there have been perhaps 10^36 snowflakes, ever, in the Earth's history. That's a lot of snowflakes.
However, there are also lots of different shapes that any particular snowflake might take on. Snowflakes exhibit six-fold symmetry because they're constructed from ice crystals, which have six-fold symmetry. (You can find a picture of one in this article.) So let's represent a snowflake as a hexagonal lattice, a bit like a honeycomb of cells, each of which might be occupied by an ice crystal, or not. An individual hexagonal ice crystal is a few tenths of a nanometer across, whereas an entire snowflake might be a few tenths of a millimeter across. So the hexagonal lattice representing our snowflake would have a diameter of about a million cells, and would contain about 750 billion cells in all.
Does that mean that there are nearly a trillion possible snowflakes? No, because each one of those cells could either have an ice crystal, or not. We could represent the snowflake by filling each one of those cells with a 1 if it had an ice crystal, or a 0 if it did not. In other words, each snowflake would be represented by a huge binary number with 750 billion digits. Such a tremendous number is on the order of 10 raised to the 230 billionth power.
It's hard to overstate how big a number this is. Even if you were to write a digit a second, every second of every hour of every day, without interruption for sleep or eating, you have perhaps only an even-money chance of just writing this number out during your entire lifetime. It goes without saying that it's much, much, much larger than 10^36. (It is, however, much smaller than a googleplex. I just thought I'd point that out.)
However, we're not playing quite fair, because we've completely neglected the symmetry exhibited by most snowflakes. If we take that into account, it turns out that the number of possible snowflakes drop to something more like 10 raised to the 40 billionth power. Quite a bit smaller, but still much larger than 10^36.
There's another thing, too. Bentley took his photographs with an optical microscope, which was of course incapable of resolving ice crystals down to the individual molecular level. These days, we're now capable of doing that, but it would be unfair to insist that snow crystals, which in an ordinary atmospheric environment would be constantly changing anyway, be identical to that level of precision. A typical photograph of a snowflake might be able to resolve crystals to a level of detail that would take a hundred thousand cells to fill the entire snowflake. Remembering to take into account the symmetry of snowflakes, there could still be on the order of 10^5,000 different snowflakes, at this reduced level of resolution. Still much larger than 10^36.
OK, how about this? If one looks at an array of Bentley's photographs, one notices that the ice crystals are not arranged haphazardly around the snowflakes, even after one takes into account the six-fold symmetry. Instead, there is order at all different scales. In fact, people have likened snowflakes to fractals; there are even simulations of snowflake generation that build upon the fractal arrangement.
That reduces the level of variation accessible to the snowflake. It's hard to say for sure, but in most of the Bentley images, I think one can make out about six levels of detail. (That's consistent with a scale ratio of about two to three.) What's more, each unit of detail has within it detail that only goes about three or four levels down, which means that each level can be represented using about fifty bits or so. That means a total of three hundred bits might suffice to denote a snowflake to the level of precision needed to figure out whether they match or not. That would still mean about 10^90 distinct snowflakes, though.
All right, one last thing, which at first will seem to be a significant digression. There is, in probability, something called the birthday paradox, which goes something like this: Suppose you get fifty otherwise randomly selected people together in a room. What are the odds that at least one pair of them will share the same birthday (possibly different year)? One in four? One in ten? How many people do you think you need to make the odds even? Would forty do it? How about sixty? A hundred?
The answer, surprising to most people who haven't heard this question before, is that the odds are about even that out of just 23 people, at least one pair will share a birthday in common. It's a bit surprising because there are 365 days in a year (not counting leap day), but consider what happens if you choose the people one by one. The first, of course, can have any birthday at all. In order to avoid a pair sharing the birthday, the second must not share a birthday with the first. The third must avoid sharing a birthday with both the first and the second. The fourth must avoid sharing a birthday with the first, the second, and the third. And so on. By the time you get to 23 people, there are about 250 birthday sharings that must be independently avoided. It's not surprising that such sharings are avoided only half the time.
It turns out that this "paradox" (not truly a paradox at all, naturally, but just a counter-intuitive result of probability theory) has very wide applicability. The number of samples that can be randomly selected before you stand a good chance of getting a pair is much smaller than the total number of choices. In fact, it's on the same order as the square root of the number of choices. (There's that square root again!) The square root of 365 is a bit over 19, and sure enough, 23 isn't very far over 19. If one takes into account the year of birth over the course of a century, then there are about 36,500 different birthdates, but the square root of 36,500 is only about 191, so that only about 200 randomly selected people are needed before you have a good chance of matching the entire birthdate. And the square root of 10^90 is 10^45, so the size of the collection of snowflakes you need to have a good chance of pairing two of them is about 10^45.
It's more than 10^36, but not much more. (What's a factor of a billion between friends?) And there are a lot of back-of-the-envelope manipulations in what I wrote, so perhaps there are other deeper symmetries to take advantage of. But I think it's rather magical that the numbers work out nicely so that it's quite possible that somewhere, across the vast mists of time, there were at (probably very different) points, two identical snowflakes!
Wednesday, March 14, 2012
Wednesday, February 15, 2012
Slip Sliding Away
Here's a counter-sliding game I came up with a while back, while visiting my parents.
My parents have these small flags of various countries, which can be stood up, UN-style, because they're on flagpoles that are stuck into circular bases. The flags can be removed from the bases to be waved, and when they are, you're left with just the circular bases. One day, while idly sliding them around the table, I thought about using them for various geometrical exercises. Of course, if one doesn't have small circular flagpole bases, one can use any kind of equally sized circular tokens; any kind of circular coin should work just fine.
The rules I set up for myself were as follows:
The space between the two counters is not wide enough to fit the center counter through (in fact, that space has a width only √3 - 1 = 0.732+ times as wide as necessary), so it cannot be slid out in accordance to Rule 3, above. You might like to see if you can figure out a solution for the hollow hexagon before reading on.
The trick is to set up support for the fifth corner first, then slide out the center counter to become the fifth corner; the sixth corner is then easily slid into place. Begin by placing six counters in a parallelogram arrangement:

Then slide counter 2 to touch counters 4 and 6:

Now slide counter 4 into the place previously occupied by counter 2:

Finally, slide counter 1 around to touch counters 2 and 4, at the sixth and last corner of the hexagon. Voilà!

I leave you with two puzzles, one fairly simple, and one open (that is, unsolved):
My parents have these small flags of various countries, which can be stood up, UN-style, because they're on flagpoles that are stuck into circular bases. The flags can be removed from the bases to be waved, and when they are, you're left with just the circular bases. One day, while idly sliding them around the table, I thought about using them for various geometrical exercises. Of course, if one doesn't have small circular flagpole bases, one can use any kind of equally sized circular tokens; any kind of circular coin should work just fine.
The rules I set up for myself were as follows:
- One starts out with two touching counters. This counts as two moves. (For "historical reasons.")
- On any subsequent move, one may add a counter; this counter must touch two existing counters on the table. (There is an exception, which I will mention later, in connection with an outstanding puzzle.)
- Or, one may remove a counter. One must remove the counter by sliding it, however, not by lifting it up off the table.
Here, counters 1 and 2 are placed first. One may then place counters 3, 4, and 5 in that order. Removing counters 3 and 4 then leaves a straight line of three counters. One could not construct that straight line directly, by just putting down counters 1, 2, and 5, because counter 5 would not have been placed in contact with two counters.
One could continue twice more around counter 2, creating a filled hexagon of seven counters. If, however, one wanted to create a hollow hexagon, one would have to remove that middle counter at some point. It seems tempting to place one more counter below counters 2 and 5, and then remove counter 2 to place at the last corner of the hexagon, but the following diagram shows why that won't work:
The trick is to set up support for the fifth corner first, then slide out the center counter to become the fifth corner; the sixth corner is then easily slid into place. Begin by placing six counters in a parallelogram arrangement:

Then slide counter 2 to touch counters 4 and 6:

Now slide counter 4 into the place previously occupied by counter 2:

Finally, slide counter 1 around to touch counters 2 and 4, at the sixth and last corner of the hexagon. Voilà!

I leave you with two puzzles, one fairly simple, and one open (that is, unsolved):
- Follow the above rules to construct a hollow triangle of side 4 (just like the arrangement in ten-pin bowling, but without the center pin), in as few moves as possible. There is more than one solution.
- Suppose we add an exception to Rule 2, above: We permit a counter to be placed in an arbitrary location on the table, but with the proviso that no required property of the final arrangement can depend on the exact location of that counter. (For instance, a construction of a rectangle that depends on a counter being placed somewhere between 1 and 2 counter widths away from another is OK, but one that depends on it being placed exactly 1-1/2 counter widths away is not.) In that case, is it possible to construct a perfect square of four counters, of any side length? The sides of the square need not be filled in with any counters.
Monday, February 13, 2012
Weighted Fair Division
I'm sure this is an old puzzle somewhere in the world, but it came upon us a few years ago here at work in connection with driving to lunch.
Where I work, the company provides a cafeteria where one may purchase lunch. Unfortunately, the lunch is either too expensive or not good enough, depending on your point of view, so we generally eat out. We're lucky that we can do that. Anyway, in general, we try to take turns driving so that we're all likely to drive about equally often. It doesn't always work out that way, but that's the aim.
If we all ate out every meal, it'd be simple; we'd all drive with equal probability. But what happens if, as has been the case occasionally throughout the years I've worked here, one of us can only eat out once per week? How often should that person drive, when they do eat out?
To make things simpler, let's assume that there are two of us daily diner (five times per week), and one single-day diner. Four days out of the week, there are only two diners. If each one drives one-half of the time, then both of them end up driving two days out of the four.
On the last remaining day of the week, when there are three diners, should each drive one-third of the time? Well, if we do things that way, then each of the two diners drives 2-1/3 days, on average, whereas the single-day diner drives just 1/3 day per week, on average. That's not fair, because the two daily diners drive seven times as much as the single-day diner, even though they only eat five times as often. The single-day diner should have to shoulder more of the driving burden on that one day.
Let's denote by p the probability that the single-day diner drives on that day. Then the two daily diners drive on that day with probability (1-p)/2, and over the course of the week, they drive (5-p)/2 days, on average. According to our fairness metric, we must find p such that (5-p)/2 = 5p, which yields
5-p = 10p
11p = 5
p = 5/11
So the single-day diner should drive nearly half of the time, on those days when he or she joins the two daily diners. By a similar line of reasoning, if there are three daily diners, that probability drops to 5/16, and in general, with n daily diners, the probability is 5/(1+5n), with each of the daily diners driving five times more often, or 25/(1+5n).
What happens if there are m one-day diners (each of them eating on the same day)? Then the probability p that any of the one-day diners should drive on that one day drops even further, to 5/(m+5n).
One might well consider (providing one is still reading) extending these to k-day diners, and whether the results depend on the k-day diners eating on the same k days, or if the results are insensitive to the distribution of those k days.
Where I work, the company provides a cafeteria where one may purchase lunch. Unfortunately, the lunch is either too expensive or not good enough, depending on your point of view, so we generally eat out. We're lucky that we can do that. Anyway, in general, we try to take turns driving so that we're all likely to drive about equally often. It doesn't always work out that way, but that's the aim. If we all ate out every meal, it'd be simple; we'd all drive with equal probability. But what happens if, as has been the case occasionally throughout the years I've worked here, one of us can only eat out once per week? How often should that person drive, when they do eat out?
To make things simpler, let's assume that there are two of us daily diner (five times per week), and one single-day diner. Four days out of the week, there are only two diners. If each one drives one-half of the time, then both of them end up driving two days out of the four.
On the last remaining day of the week, when there are three diners, should each drive one-third of the time? Well, if we do things that way, then each of the two diners drives 2-1/3 days, on average, whereas the single-day diner drives just 1/3 day per week, on average. That's not fair, because the two daily diners drive seven times as much as the single-day diner, even though they only eat five times as often. The single-day diner should have to shoulder more of the driving burden on that one day.
Let's denote by p the probability that the single-day diner drives on that day. Then the two daily diners drive on that day with probability (1-p)/2, and over the course of the week, they drive (5-p)/2 days, on average. According to our fairness metric, we must find p such that (5-p)/2 = 5p, which yields
5-p = 10p
11p = 5
p = 5/11
So the single-day diner should drive nearly half of the time, on those days when he or she joins the two daily diners. By a similar line of reasoning, if there are three daily diners, that probability drops to 5/16, and in general, with n daily diners, the probability is 5/(1+5n), with each of the daily diners driving five times more often, or 25/(1+5n).
What happens if there are m one-day diners (each of them eating on the same day)? Then the probability p that any of the one-day diners should drive on that one day drops even further, to 5/(m+5n).
One might well consider (providing one is still reading) extending these to k-day diners, and whether the results depend on the k-day diners eating on the same k days, or if the results are insensitive to the distribution of those k days.
Tuesday, February 7, 2012
Roll Over, You Pats!
This past weekend's Super Bowl XLVI (that's forty-six) provided yet another confluence of probability, tactics, and sports. That's never a bad thing.
I'm speaking, of course, of the decision on the part of Patriots coach Bill Belichick to permit the Giants to score on second down and goal from the Patriots' six-yard line, with about a minute left in the game. The Patriots put up only token defense, so that when Ahmad Bradshaw took the handoff from Eli Manning, he was able to waltz into the end zone. Almost literally: Bradshaw had a moment of indecisiveness as he reached the one-yard line, but soon backed into the end zone for the touchdown.
Even before that play began, color commentator Cris Collinsworth had already suggested that the Patriots might permit the Giants to score easily, because the Patriots only had one timeout remaining. They would therefore be able to stop the clock after second down, but not after third down. Since the play clock starts at forty seconds once the ball is set, the Giants would attempt a field goal on fourth down with only about ten to fifteen seconds remaining on the game clock. Collinsworth reasonably contended that the Patriots should prefer trying to score a touchdown with a minute left (plus their one timeout) over trying to score a field goal with ten to fifteen seconds left (without any timeouts).
(It's worth pointing out that then-Packers coach Mike Holmgren had been roundly criticized for making a similar tactical decision fourteen years earlier, in Super Bowl XXXII against the Broncos. Times change.)
And now, once Bradshaw had scored, Collinsworth decried Bradshaw's touchdown as a tactical error. Well, setting aside the tendency of sports broadcasters to exaggerate practically anything, was it a tactical error? Which outcome is better for each team?
Well, first of all, there's the intuitive argument that if one team wants you to do something, then your best strategy ought to be to resist that. So if the Patriots are parting the Red Sea, maybe your best bet is to lie down. And indeed, the Giants had considered that. Manning later reported that he was telling Bradshaw to go down in the field of play. The Patriots, for their part, said that it was immaterial, that they would have shoved Bradshaw into the end zone, but that tactic would not have worked if Bradshaw had taken a knee: Any subsequent bump by a defender, even the lightest touch, would have made Bradshaw down by contact at the one-yard line.
But let's not let psychological ploys decide the question. Which tactical choice is the right one here?
The Patriots have two choices—allow the touchdown, or play straightforward defense—but there are more than two possible outcomes. If the Patriots play defense, there are still multiple possibilities:
On the other hand, if the Patriots play defense, then there are those four possibilities:
Now let's take a look at those last two cases. If they don't score on second or third down, the remaining possibilities are a turnover, a missed field goal, or a made field goal. Out of those, I'd guess the made field goal happens nineteen times out of twenty. In the remaining cases, the Patriots just have to sit on the ball, which I'd also guess would happen nineteen times out of twenty (remember, they might have to avoid the safety). So the question roughly boils down to, which is more likely: Scoring a touchdown in a minute, or one of the following happening—scoring a field goal in ten to fifteen seconds, securing a turnover, or the Giants missing a field goal? If it's the touchdown, the Patriots should let the Giants score. If it's any of the remaining three choices, they should play straightforward defense.
Given that Lawrence Tynes hadn't missed a field goal of thirty yards or less in forever, the Giants were going to play possession football, and the Patriots would have no timeouts left for a field goal attempt, I'd go with letting them score, just as Belichick did. But there's no way this is a foregone conclusion. Sometimes, it's just a close call.
I'm speaking, of course, of the decision on the part of Patriots coach Bill Belichick to permit the Giants to score on second down and goal from the Patriots' six-yard line, with about a minute left in the game. The Patriots put up only token defense, so that when Ahmad Bradshaw took the handoff from Eli Manning, he was able to waltz into the end zone. Almost literally: Bradshaw had a moment of indecisiveness as he reached the one-yard line, but soon backed into the end zone for the touchdown.
Even before that play began, color commentator Cris Collinsworth had already suggested that the Patriots might permit the Giants to score easily, because the Patriots only had one timeout remaining. They would therefore be able to stop the clock after second down, but not after third down. Since the play clock starts at forty seconds once the ball is set, the Giants would attempt a field goal on fourth down with only about ten to fifteen seconds remaining on the game clock. Collinsworth reasonably contended that the Patriots should prefer trying to score a touchdown with a minute left (plus their one timeout) over trying to score a field goal with ten to fifteen seconds left (without any timeouts).
(It's worth pointing out that then-Packers coach Mike Holmgren had been roundly criticized for making a similar tactical decision fourteen years earlier, in Super Bowl XXXII against the Broncos. Times change.)
And now, once Bradshaw had scored, Collinsworth decried Bradshaw's touchdown as a tactical error. Well, setting aside the tendency of sports broadcasters to exaggerate practically anything, was it a tactical error? Which outcome is better for each team?
Well, first of all, there's the intuitive argument that if one team wants you to do something, then your best strategy ought to be to resist that. So if the Patriots are parting the Red Sea, maybe your best bet is to lie down. And indeed, the Giants had considered that. Manning later reported that he was telling Bradshaw to go down in the field of play. The Patriots, for their part, said that it was immaterial, that they would have shoved Bradshaw into the end zone, but that tactic would not have worked if Bradshaw had taken a knee: Any subsequent bump by a defender, even the lightest touch, would have made Bradshaw down by contact at the one-yard line.
But let's not let psychological ploys decide the question. Which tactical choice is the right one here?
The Patriots have two choices—allow the touchdown, or play straightforward defense—but there are more than two possible outcomes. If the Patriots play defense, there are still multiple possibilities:
- The Giants might score on second down anyway.
- Or they might score on third down.
- Or they might score a field goal on fourth down. (We'll assume they wouldn't try to score a touchdown.)
- Or they might fail to score at all, either because of a turnover or a missed field goal.
On the other hand, if the Patriots play defense, then there are those four possibilities:
- If the Giants score on second down, the Patriots still have to score a touchdown with about a minute remaining, and one timeout.
- If the Giants score on third down, the Patriots have to score a touchdown with about a minute remaining, but no timeouts.
- If the Giants score a field goal, the Patriots have to score a field goal with ten to fifteen seconds left, and no timeouts.
- If the Giants fail to score at all, the Patriots can simply run out the clock.
Now let's take a look at those last two cases. If they don't score on second or third down, the remaining possibilities are a turnover, a missed field goal, or a made field goal. Out of those, I'd guess the made field goal happens nineteen times out of twenty. In the remaining cases, the Patriots just have to sit on the ball, which I'd also guess would happen nineteen times out of twenty (remember, they might have to avoid the safety). So the question roughly boils down to, which is more likely: Scoring a touchdown in a minute, or one of the following happening—scoring a field goal in ten to fifteen seconds, securing a turnover, or the Giants missing a field goal? If it's the touchdown, the Patriots should let the Giants score. If it's any of the remaining three choices, they should play straightforward defense.
Given that Lawrence Tynes hadn't missed a field goal of thirty yards or less in forever, the Giants were going to play possession football, and the Patriots would have no timeouts left for a field goal attempt, I'd go with letting them score, just as Belichick did. But there's no way this is a foregone conclusion. Sometimes, it's just a close call.
Labels:
football,
Giants,
Patriots,
probability,
tactics
Thursday, February 2, 2012
The Philosophy of Probability
I've previously written about the importance of context in statistical analysis, though only as a prelude to saying something about game theory, and how it can influence one-dimensional performance measures in obscure ways. Well, hold onto your hats, because now I'm going to talk directly about context.
As a simple, frivolous example of context, consider the claim (possibly apocryphal—it's such a good story) of an ESP researcher who apparently did a comprehensive study with a large number of volunteers. A thousand or so, in fact. And he said that although most of the volunteers lacked any notable ESP talent, one in particular seemed to have it in spades. In fact, he said, he had done a statistical analysis on the results, and this volunteer had scored "at the three-sigma level."
What does that mean? It means that the results of his volunteers followed a bell-curve sort of distribution, which has a standard deviation. The standard deviation is a measure of the spread or width of the bell curve, and is denoted by the Greek letter sigma (σ). So a result that is 3σ above the average is very unusual indeed.
How unusual? Well, a 1σ result is high enough that we'd expect only one volunteer in six to score that high, just out of random chance. A 2σ result is high enough that we'd expect only one volunteer in forty to score that high. And a 3σ result is high enough that we'd expect only one volunteer in a thousand to score that high. So that must be significant, right? I mean, only one in a thousand volunteers could be expected to score that high by chance. Oh, except that there were a thousand volunteers...so...uhh, maybe it's not so significant after all.
OK, maybe that story is apocryphal; I couldn't find it in a Google link. (Except now that I've posted this story, I'll be able to find it—on my blog.) But there is this excellent xkcd comic, which makes exactly the same point.
The first time I was directly confronted with this was many years ago, when I was tutoring a family friend in probability and statistics. Her teacher had assigned her a worksheet, and one of the problems concerned airplane accidents:
When we say that something is statistically significant at a given probability level, that refers to something called the null hypothesis, a central notion in statistics, and the inspiration for the name of this blog. The exact interpretation of the null hypothesis depends on the kind of problem you're examining, but roughly speaking, it asserts that there is no correlation, that there is no effect to be measured, that everything observed is the result of random chance variation. One doesn't—in fact, can't—prove the null hypothesis; in a sense, it is not even really assumed. We just compare other hypotheses to it.
So, in this case, the null hypothesis is that US Airways is not in fact more likely to have accidents than any other airline, that each accident is equally likely to involve any of the airlines. What we do then is to compute the probability that the pattern observed—four out of seven accidents involving US Airways—if the null hypothesis is presumed for the sake of argument to hold. If the resulting probability is less than 5% (1 in 20), then the observation is statistically significant at the 5% level. If the resulting probability is less than 1%, then the observation is statistically significant at the 1% level. And so on.
Well, let's go through the exercise. If there are five airlines, and seven accidents, and each accident is equally likely to involve any one of the five airlines (we'll assume for the time being that no accident involves more than one of the airlines), the probability we want is given by the binomial theorem:
P[US Airways is involved in four of seven accidents] = C(7,4) (1/5)^4 (4/5)^3 = 0.0287-
Since the probability is 2.87% < 5%, the observation is significant at the 5% level. Even if you add in the probability that they're involved in five or more accidents out of the seven, that probability only swells to 3.33%, so it's still significant at the 5% level.
Or so the teacher claimed. I wasn't so sure. I would say it depends a lot on what you're trying to determine, and here we get into an area where statistics is as much philosophy as it is mathematics and science.
The question is, why are you asking this question about US Airways? Is it because you have some other, material reason for doubting the safety of their flights? Or is it just the fact that four out of seven accidents involved them? This may seem like arguing about the number of angels that can dance on the head of a pin, but in truth, your approach to the question depends vitally on which it is. If it's because you have some other reason for suspecting US Airways—say, that they have shoddy maintenance records—then your line of reasoning is perfectly valid.
But if it's just the latter—if it's just a matter of noticing a cluster of US Airways accidents—then any airline at all might be the target of such a statistical analysis. We should then be asking what the probability is that any of the five airlines was involved in four (or more) of seven accidents. Since only one airline can be involved in four or more of the accidents, we can determine that probability very simply, by multiplying the single-airline probability by five, in which case we get 16.66%, which is decidedly not statistically significant. There's actually a one-in-six chance that some airline would be involved in four or more of seven accidents.
Why, that's only 1σ! Big deal!
As a simple, frivolous example of context, consider the claim (possibly apocryphal—it's such a good story) of an ESP researcher who apparently did a comprehensive study with a large number of volunteers. A thousand or so, in fact. And he said that although most of the volunteers lacked any notable ESP talent, one in particular seemed to have it in spades. In fact, he said, he had done a statistical analysis on the results, and this volunteer had scored "at the three-sigma level."
What does that mean? It means that the results of his volunteers followed a bell-curve sort of distribution, which has a standard deviation. The standard deviation is a measure of the spread or width of the bell curve, and is denoted by the Greek letter sigma (σ). So a result that is 3σ above the average is very unusual indeed.
How unusual? Well, a 1σ result is high enough that we'd expect only one volunteer in six to score that high, just out of random chance. A 2σ result is high enough that we'd expect only one volunteer in forty to score that high. And a 3σ result is high enough that we'd expect only one volunteer in a thousand to score that high. So that must be significant, right? I mean, only one in a thousand volunteers could be expected to score that high by chance. Oh, except that there were a thousand volunteers...so...uhh, maybe it's not so significant after all.
OK, maybe that story is apocryphal; I couldn't find it in a Google link. (Except now that I've posted this story, I'll be able to find it—on my blog.) But there is this excellent xkcd comic, which makes exactly the same point.
The first time I was directly confronted with this was many years ago, when I was tutoring a family friend in probability and statistics. Her teacher had assigned her a worksheet, and one of the problems concerned airplane accidents:
A survey of U.S. airplane accidents included seven major accidents involving fatalities. Although the survey covered five airlines, four of the accidents involved a single airline: US Airways. Is this statistically significant at the 5% level?(There may really have been such a survey: There was a period in the early-to-mid 1990s in which US Airways did in fact have a slew of major accidents.) Now, I should say something about that last sentence, because it sounds like we're asking what the probability is that the observation is just a result of random chance, and whether that probability is less than 5 percent. But that's not actually quite right.
When we say that something is statistically significant at a given probability level, that refers to something called the null hypothesis, a central notion in statistics, and the inspiration for the name of this blog. The exact interpretation of the null hypothesis depends on the kind of problem you're examining, but roughly speaking, it asserts that there is no correlation, that there is no effect to be measured, that everything observed is the result of random chance variation. One doesn't—in fact, can't—prove the null hypothesis; in a sense, it is not even really assumed. We just compare other hypotheses to it.
So, in this case, the null hypothesis is that US Airways is not in fact more likely to have accidents than any other airline, that each accident is equally likely to involve any of the airlines. What we do then is to compute the probability that the pattern observed—four out of seven accidents involving US Airways—if the null hypothesis is presumed for the sake of argument to hold. If the resulting probability is less than 5% (1 in 20), then the observation is statistically significant at the 5% level. If the resulting probability is less than 1%, then the observation is statistically significant at the 1% level. And so on.
Well, let's go through the exercise. If there are five airlines, and seven accidents, and each accident is equally likely to involve any one of the five airlines (we'll assume for the time being that no accident involves more than one of the airlines), the probability we want is given by the binomial theorem:
P[US Airways is involved in four of seven accidents] = C(7,4) (1/5)^4 (4/5)^3 = 0.0287-
Since the probability is 2.87% < 5%, the observation is significant at the 5% level. Even if you add in the probability that they're involved in five or more accidents out of the seven, that probability only swells to 3.33%, so it's still significant at the 5% level.
Or so the teacher claimed. I wasn't so sure. I would say it depends a lot on what you're trying to determine, and here we get into an area where statistics is as much philosophy as it is mathematics and science.
The question is, why are you asking this question about US Airways? Is it because you have some other, material reason for doubting the safety of their flights? Or is it just the fact that four out of seven accidents involved them? This may seem like arguing about the number of angels that can dance on the head of a pin, but in truth, your approach to the question depends vitally on which it is. If it's because you have some other reason for suspecting US Airways—say, that they have shoddy maintenance records—then your line of reasoning is perfectly valid.
But if it's just the latter—if it's just a matter of noticing a cluster of US Airways accidents—then any airline at all might be the target of such a statistical analysis. We should then be asking what the probability is that any of the five airlines was involved in four (or more) of seven accidents. Since only one airline can be involved in four or more of the accidents, we can determine that probability very simply, by multiplying the single-airline probability by five, in which case we get 16.66%, which is decidedly not statistically significant. There's actually a one-in-six chance that some airline would be involved in four or more of seven accidents.
Why, that's only 1σ! Big deal!
Friday, January 27, 2012
Shot Selection and the Secretary Problem
One of my favorite problems in all of recreational mathematics is the so-called secretary problem. In this problem, you are interviewing a hundred candidates for a secretarial position. For the purposes of discussion, we'll assume that the various candidates have a precise suitability rating, and of course, you want to maximize this rating for your hire. Ideally, then, you'd interview all hundred candidates first, get their ratings, and then hire the best one.
Unfortunately, that's not the way things work in this problem. You only get the candidates one at a time, and you have to decide then and there whether or not to hire them or not. Once you've rejected a candidate, they're lost to you forever. One could, theoretically, lose the best candidate on the very first interview.
The question then is, what is your best strategy, and what is your probability of making the best possible hire using that strategy?
(By the way, if you think this problem is formulated in a politically incorrect way, I first encountered it in a form called the sultan's dowry, in which a suitor for the sultan's daughters had to select the one with the largest dowry. If he picked the right one, he got to marry her, but if he didn't—well, let's just say an unsuccessful suitor and his head are soon parted. But it wasn't entirely unproductive; I eventually formulated a variation called the iterated sultan's dowry, in which a second suitor, seeing the first suitor's unsuccessful head roll down the hill, gets to use the information in choosing a prospective mate, and then the third, the fourth, etc. This variation has an interesting solution which is unfortunately too large to fit into this parenthetical comment.)
It can be shown, fairly easily, that the best strategy must be of the form "Skip the first n candidates, recording their suitability ratings. Then choose the next candidate whose rating exceeds theirs." The reason is that as you plow through the candidates, the probability that the best one is yet to come never increases, whereas the probability that you've already encountered the best one never decreases. So the question reduces to figuring out what the right choice for n is.
Ultimately, following a strategy like this, you could end up choosing no candidate at all if the best candidate is already in the first n, since you've already skipped all of those. But if you do pick a candidate, it will be number k > n.
For that one to be the best overall, the best must be in the last 100-n. Furthermore, the second best of the first k (that is, the best before the ultimate choice) must belong in the first n. Now, let's work out the probability that both of these happen. In order to do this, we have to break down the possibilities into all the different cases.
The first case is that the best candidate is the very next one—candidate number n+1—which happens with probability 1/100. You'll choose that one provided that there's no other candidate between the first n and number n+1 that is better than the first n. Since there are no candidates in between, that probability is 1. So the incremental probability for this case is 1/100 times 1, or just 1/100.
The second case is that the best candidate is the one after that—candidate number n+2—which again happens with probability 1/100. You'll choose that one provided that there's no candidate between the first n and number n+2 that is better than the first one. That will be true provided the best of the first n+1 happens within the first n, so the incremental probability for this case is 1/100 times n/(n+1).
Following the same line of reasoning, the third case—that the best candidate is number n+3—provides an incremental probability of 1/100 times n/(n+2), the fourth case provides an incremental probability of 1/100 times n/(n+3), etc., until the last case—that the best candidate is number 100—provides an incremental probability of 1/100 times n/99.
Putting all these cases together, this strategy "wins" with probability
n/100 × [1/n + 1/(n+1) + 1/(n+2) + · · · + 1/99]
It can be shown, using relatively straightforward calculus, that this expression reaches a maximum when n = 37, and yields a probability of success of about 0.37.
That's not a coincidence, incidentally. For large candidate pools (and a hundred candidates qualify as a large pool), of size N, the best strategy is to skip the first n = N/e, where e = 2.71828+ is the base of the natural logarithm, and to take the earliest best candidate thereafter. The approximate probability of success (that is, choosing the very best candidate of them all) is very close to 1/e = 0.36787+.
For a lot of people (including myself), that's rather stunning. It implies that even if you have a million candidates, you have a strategy that picks the very best one of them with better than a one-in-three chance.
The reason I'm putting basketball in the mix is that there's a fairly straightforward application to a vital aspect of scoring: shot selection.
Consider: A possession in basketball lasts for anywhere from 0 to 24 seconds (neglecting offensive rebounds). You can't always guarantee that you'll make the shot; the next best thing (at least before the endgame) is to select the very best shot—that is, the shot that has the best probability of going in (neglecting fouls and three-point shots).
In other words, ahem, optimal shot selection.
But you don't always know when that best shot is going to come, especially when you're working out of a halfcourt set. Is it the very first one? Is it the next best one? Maybe the best one will come at least twenty seconds into the possession. You just don't know. But maybe, now, you have a rule of thumb for selecting that best shot. You skip the ones that come in the first 24/e = 9 seconds (approximately), and take the next best one that comes thereafter.
Obviously, this rule makes lots of assumptions, such as (a) the defense is equally tenacious across the entire possession, (b) the offense is equally productive of shot opportunities across the entire possession, (c) the best shot opportunity is equally likely to come at any time during the entire possession, etc. But to the limited extent that these assumptions are approximately valid, it's not a bad rule of thumb. It suggests that the Phoenix Suns of the early-to-mid-2000s were a bit hasty.
But not by much. Just a second or two.
Unfortunately, that's not the way things work in this problem. You only get the candidates one at a time, and you have to decide then and there whether or not to hire them or not. Once you've rejected a candidate, they're lost to you forever. One could, theoretically, lose the best candidate on the very first interview.
The question then is, what is your best strategy, and what is your probability of making the best possible hire using that strategy?
(By the way, if you think this problem is formulated in a politically incorrect way, I first encountered it in a form called the sultan's dowry, in which a suitor for the sultan's daughters had to select the one with the largest dowry. If he picked the right one, he got to marry her, but if he didn't—well, let's just say an unsuccessful suitor and his head are soon parted. But it wasn't entirely unproductive; I eventually formulated a variation called the iterated sultan's dowry, in which a second suitor, seeing the first suitor's unsuccessful head roll down the hill, gets to use the information in choosing a prospective mate, and then the third, the fourth, etc. This variation has an interesting solution which is unfortunately too large to fit into this parenthetical comment.)
It can be shown, fairly easily, that the best strategy must be of the form "Skip the first n candidates, recording their suitability ratings. Then choose the next candidate whose rating exceeds theirs." The reason is that as you plow through the candidates, the probability that the best one is yet to come never increases, whereas the probability that you've already encountered the best one never decreases. So the question reduces to figuring out what the right choice for n is.
Ultimately, following a strategy like this, you could end up choosing no candidate at all if the best candidate is already in the first n, since you've already skipped all of those. But if you do pick a candidate, it will be number k > n.
For that one to be the best overall, the best must be in the last 100-n. Furthermore, the second best of the first k (that is, the best before the ultimate choice) must belong in the first n. Now, let's work out the probability that both of these happen. In order to do this, we have to break down the possibilities into all the different cases.
The first case is that the best candidate is the very next one—candidate number n+1—which happens with probability 1/100. You'll choose that one provided that there's no other candidate between the first n and number n+1 that is better than the first n. Since there are no candidates in between, that probability is 1. So the incremental probability for this case is 1/100 times 1, or just 1/100.
The second case is that the best candidate is the one after that—candidate number n+2—which again happens with probability 1/100. You'll choose that one provided that there's no candidate between the first n and number n+2 that is better than the first one. That will be true provided the best of the first n+1 happens within the first n, so the incremental probability for this case is 1/100 times n/(n+1).
Following the same line of reasoning, the third case—that the best candidate is number n+3—provides an incremental probability of 1/100 times n/(n+2), the fourth case provides an incremental probability of 1/100 times n/(n+3), etc., until the last case—that the best candidate is number 100—provides an incremental probability of 1/100 times n/99.
Putting all these cases together, this strategy "wins" with probability
n/100 × [1/n + 1/(n+1) + 1/(n+2) + · · · + 1/99]
It can be shown, using relatively straightforward calculus, that this expression reaches a maximum when n = 37, and yields a probability of success of about 0.37.
That's not a coincidence, incidentally. For large candidate pools (and a hundred candidates qualify as a large pool), of size N, the best strategy is to skip the first n = N/e, where e = 2.71828+ is the base of the natural logarithm, and to take the earliest best candidate thereafter. The approximate probability of success (that is, choosing the very best candidate of them all) is very close to 1/e = 0.36787+.
For a lot of people (including myself), that's rather stunning. It implies that even if you have a million candidates, you have a strategy that picks the very best one of them with better than a one-in-three chance.
The reason I'm putting basketball in the mix is that there's a fairly straightforward application to a vital aspect of scoring: shot selection.
Consider: A possession in basketball lasts for anywhere from 0 to 24 seconds (neglecting offensive rebounds). You can't always guarantee that you'll make the shot; the next best thing (at least before the endgame) is to select the very best shot—that is, the shot that has the best probability of going in (neglecting fouls and three-point shots).
In other words, ahem, optimal shot selection.
But you don't always know when that best shot is going to come, especially when you're working out of a halfcourt set. Is it the very first one? Is it the next best one? Maybe the best one will come at least twenty seconds into the possession. You just don't know. But maybe, now, you have a rule of thumb for selecting that best shot. You skip the ones that come in the first 24/e = 9 seconds (approximately), and take the next best one that comes thereafter.
Obviously, this rule makes lots of assumptions, such as (a) the defense is equally tenacious across the entire possession, (b) the offense is equally productive of shot opportunities across the entire possession, (c) the best shot opportunity is equally likely to come at any time during the entire possession, etc. But to the limited extent that these assumptions are approximately valid, it's not a bad rule of thumb. It suggests that the Phoenix Suns of the early-to-mid-2000s were a bit hasty.
But not by much. Just a second or two.
Labels:
basketball,
probability,
recreational mathematics,
Suns
Thursday, January 5, 2012
Constructions with Compass, Straightedge, and Flatiron
There may not be many of you out there who remember high-school geometry fondly, but those of you who do probably share my appreciation for compass-and-straightedge constructions. I (re-)started thinking about this today, when someone mentioned, jocularly, preparing for travel at Warp Cube Root of Two, and it occurred to me that this was one of the three classical things that one cannot do with compass and straightedge.Those strangely intermediate folks who are at once unfamiliar with compass-and-straightedge constructions and yet not intimidated by their spectre may find the Wikipedia page a reasonable starting place. But what I'm proposing today is something different, which I'm a bit startled hasn't been discussed more prominently: three-dimensional constructions. I've had this in my virtual back pocket for a while; this seems as good an opportunity as any to pull it out for a looksee, and figure out if anyone else has encountered anything like this.
The idea here is to extend to three dimensions what ordinary compass-and-straightedge constructions do in two dimensions. The first thing is to define the tools and rules for their use. For instance, in two dimensions, the tools are a compass and straightedge (like a ruler, but with only one edge and no markings), and with them, one may:
- Draw a line between any two distinct points.
- Draw a circle with one point as the center, and any other point on its circumference.
- Draw an arbitrary point on a line or a circle, or off it.
- Draw the point at the intersection of two lines (if they intersect).
- Draw the point (or two) at the intersection of two circles (if they intersect).
- Draw the point (or two) at the intersection of a line and a circle (if they intersect).
It turns out that these are impossible, and can be proved to be so, using some notions from field theory. That has not, of course, stopped people from submitting reams upon reams of alleged constructions of one of these three objects, all of which (you may be assured) are somewhere bogus.
But enough of that for now. In three dimensions, the canvas is not a flat plane, as it is in two dimensions, but all of space. And we introduce a new tool, which I will call a flatiron, which permits you to draw planes. The flatiron rules are as follows; in addition to the above, one may:
- Draw the unique plane containing any three non-collinear points.
- Draw a sphere with one point as the center, and any other point on its surface.
- Draw an arbitrary point on a plane or a sphere, or off it.
- Draw the line at the intersection of two planes (if they intersect).
- Draw the circle (or point) at the intersection of two spheres (if they intersect).
- Draw the circle (or point) at the intersection of a plane and a sphere (if they intersect).
- Draw the point (or two) at the intersection of a line or circle with a plane or sphere (if they intersect).
- Draw spheres of radius PQ around both P and Q.
- Draw the circle C at the intersection of spheres P and Q.
- Draw R, an arbitrary point on circle C.
- Draw a sphere of radius PR around R.
- Draw S, one of the two points of intersection between circle C and sphere R. PQRS is then a regular tetrahedron.
So, a couple of questions, one easy, and one not so easy:
- Suppose that indeed, the cube root of two is not constructible in three dimensions. What about the fourth root of two? In which dimensions might that be constructible?
- Is it possible to construct all five of the regular polyhedra? In addition to the tetrahedron and cube, these include the regular octahedron (eight faces), dodecahedron (twelve faces), and icosahedron (twenty faces).
Subscribe to:
Posts (Atom)





















