Wednesday, February 15, 2012

Slip Sliding Away

Here's a counter-sliding game I came up with a while back, while visiting my parents.

My parents have these small flags of various countries, which can be stood up, UN-style, because they're on flagpoles that are stuck into circular bases.  The flags can be removed from the bases to be waved, and when they are, you're left with just the circular bases.  One day, while idly sliding them around the table, I thought about using them for various geometrical exercises.  Of course, if one doesn't have small circular flagpole bases, one can use any kind of equally sized circular tokens; any kind of circular coin should work just fine.

The rules I set up for myself were as follows:
  1. One starts out with two touching counters.  This counts as two moves.  (For "historical reasons.")
  2. On any subsequent move, one may add a counter; this counter must touch two existing counters on the table.  (There is an exception, which I will mention later, in connection with an outstanding puzzle.)
  3. Or, one may remove a counter.  One must remove the counter by sliding it, however, not by lifting it up off the table.
The following picture shows an example.

Here, counters 1 and 2 are placed first.  One may then place counters 3, 4, and 5 in that order.  Removing counters 3 and 4 then leaves a straight line of three counters.  One could not construct that straight line directly, by just putting down counters 1, 2, and 5, because counter 5 would not have been placed in contact with two counters.

One could continue twice more around counter 2, creating a filled hexagon of seven counters.  If, however, one wanted to create a hollow hexagon, one would have to remove that middle counter at some point.  It seems tempting to place one more counter below counters 2 and 5, and then remove counter 2 to place at the last corner of the hexagon, but the following diagram shows why that won't work:


The space between the two counters is not wide enough to fit the center counter through (in fact, that space is only √3 - 1 = 0.732+ times as wide as it would need to be), so the counter cannot be slid out in accordance with Rule 3, above.  You might like to see if you can figure out a solution for the hollow hexagon before reading on.
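For the skeptical, here's a quick sanity check of that √3 - 1 figure, sketched in Python with the counter diameter taken as the unit of length:

```python
import math

d = 1.0  # counter diameter (our unit of length)

# In a hexagonal arrangement of touching counters, the two counters
# flanking the escape route are "second neighbors": their centers sit
# at the ends of two unit segments meeting at 60 degrees, so they are
# sqrt(3) diameters apart.
center_distance = math.sqrt(3) * d

# The clear gap between their rims:
gap = center_distance - d  # = (sqrt(3) - 1) * d

# To slide a counter through, the gap would need to be a full diameter,
# so this ratio is the "times as wide as necessary" figure.
print(gap / d)  # about 0.732 -- too narrow
```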

The trick is to set up support for the fifth corner first, then slide out the center counter to become the fifth corner; the sixth corner is then easily slid into place.  Begin by placing six counters in a parallelogram arrangement:

Then slide counter 2 to touch counters 4 and 6:

Now slide counter 4 into the place previously occupied by counter 2:

Finally, slide counter 1 around to touch counters 2 and 4, at the sixth and last corner of the hexagon.  Voilà!

I leave you with two puzzles, one fairly simple, and one open (that is, unsolved):
  1. Follow the above rules to construct a hollow triangle of side 4 (just like the arrangement in ten-pin bowling, but without the center pin), in as few moves as possible.  There is more than one solution.
  2. Suppose we add an exception to Rule 2, above: We permit a counter to be placed in an arbitrary location on the table, but with the proviso that no required property of the final arrangement can depend on the exact location of that counter.  (For instance, a construction of a rectangle that depends on a counter being placed somewhere between 1 and 2 counter widths away from another is OK, but one that depends on it being placed exactly 1-1/2 counter widths away is not.)  In that case, is it possible to construct a perfect square of four counters, of any side length?  The sides of the square need not be filled in with any counters.

Monday, February 13, 2012

Weighted Fair Division

I'm sure this is an old puzzle somewhere in the world, but it came upon us a few years ago here at work in connection with driving to lunch.

Where I work, the company provides a cafeteria where one may purchase lunch.  Unfortunately, the lunch is either too expensive or not good enough, depending on your point of view, so we generally eat out.  We're lucky that we can do that.  Anyway, in general, we try to take turns driving so that we're all likely to drive about equally often.  It doesn't always work out that way, but that's the aim.

If we all ate out every meal, it'd be simple; we'd all drive with equal probability.  But what happens if, as has been the case occasionally throughout the years I've worked here, one of us can only eat out once per week?  How often should that person drive, when they do eat out?

To make things simpler, let's assume that there are two daily diners (eating out five times per week) and one single-day diner.  Four days out of the week, there are only two diners.  If each one drives one-half of the time, then both of them end up driving two days out of the four.

On the last remaining day of the week, when there are three diners, should each drive one-third of the time?  Well, if we do things that way, then each of the two diners drives 2-1/3 days, on average, whereas the single-day diner drives just 1/3 day per week, on average.  That's not fair, because the two daily diners drive seven times as much as the single-day diner, even though they only eat five times as often.  The single-day diner should have to shoulder more of the driving burden on that one day.

Let's denote by p the probability that the single-day diner drives on that day.  Then the two daily diners drive on that day with probability (1-p)/2, and over the course of the week, they drive (5-p)/2 days, on average.  According to our fairness metric, we must find p such that (5-p)/2 = 5p, which yields

5-p = 10p

11p = 5

p = 5/11

So the single-day diner should drive nearly half of the time, on those days when he or she joins the two daily diners.  By a similar line of reasoning, if there are three daily diners, that probability drops to 5/16, and in general, with n daily diners, the probability is 5/(1+5n), with each of the daily diners driving five times as often, or 25/(1+5n) days per week, on average.

What happens if there are m one-day diners (each of them eating on the same day)?  Then the probability p that any of the one-day diners should drive on that one day drops even further, to 5/(m+5n).
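The general formula is easy to check with exact arithmetic.  Here's a sketch in Python (the function name is just my own label for the p above), verifying that each daily diner ends up driving exactly five times as much as each one-day diner:

```python
from fractions import Fraction

def one_day_driver_probability(n, m):
    """Probability that a given one-day diner drives on the shared day,
    with n daily diners and m one-day diners (all eating on the same day).
    Derived from the fairness condition (4 + 1 - m*p)/n == 5*p."""
    return Fraction(5, m + 5 * n)

# Check the fairness condition directly for a few cases.
for n in range(1, 5):
    for m in range(1, 5):
        p = one_day_driver_probability(n, m)
        # Daily diner: drives 1/n of the time on four days, and with
        # probability (1 - m*p)/n on the shared day.
        daily_weekly = Fraction(4, n) + (1 - m * p) / n
        assert daily_weekly == 5 * p  # 5:1 driving, matching 5:1 dining

print(one_day_driver_probability(2, 1))  # the 5/11 derived above
```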

One might well consider (providing one is still reading) extending these to k-day diners, and whether the results depend on the k-day diners eating on the same k days, or if the results are insensitive to the distribution of those k days.

Tuesday, February 7, 2012

Roll Over, You Pats!

This past weekend's Super Bowl XLVI (that's forty-six) provided yet another confluence of probability, tactics, and sports.  That's never a bad thing.

I'm speaking, of course, of the decision on the part of Patriots coach Bill Belichick to permit the Giants to score on second down and goal from the Patriots' six-yard line, with about a minute left in the game.  The Patriots put up only token defense, so that when Ahmad Bradshaw took the handoff from Eli Manning, he was able to waltz into the end zone.  Almost literally: Bradshaw had a moment of indecisiveness as he reached the one-yard line, but soon backed into the end zone for the touchdown.

Even before that play began, color commentator Cris Collinsworth had already suggested that the Patriots might permit the Giants to score easily, because the Patriots only had one timeout remaining.  They would therefore be able to stop the clock after second down, but not after third down.  Since the play clock starts at forty seconds once the ball is set, the Giants would attempt a field goal on fourth down with only about ten to fifteen seconds remaining on the game clock.  Collinsworth reasonably contended that the Patriots should prefer trying to score a touchdown with a minute left (plus their one timeout) over trying to score a field goal with ten to fifteen seconds left (without any timeouts).

(It's worth pointing out that then-Packers coach Mike Holmgren had been roundly criticized for making a similar tactical decision fourteen years earlier, in Super Bowl XXXII against the Broncos.  Times change.)

And now, once Bradshaw had scored, Collinsworth decried Bradshaw's touchdown as a tactical error.  Well, setting aside the tendency of sports broadcasters to exaggerate practically anything, was it a tactical error?  Which outcome is better for each team?

Well, first of all, there's the intuitive argument that if one team wants you to do something, then your best strategy ought to be to resist that.  So if the Patriots are parting the Red Sea, maybe your best bet is to lie down.  And indeed, the Giants had considered that.  Manning later reported that he was telling Bradshaw to go down in the field of play.  The Patriots, for their part, said that it was immaterial, that they would have shoved Bradshaw into the end zone, but that tactic would not have worked if Bradshaw had taken a knee: Any subsequent bump by a defender, even the lightest touch, would have made Bradshaw down by contact at the one-yard line.

But let's not let psychological ploys decide the question.  Which tactical choice is the right one here?

The Patriots have two choices—allow the touchdown, or play straightforward defense—but there are more than two possible outcomes.  If the Patriots play defense, there are still multiple possibilities:
  • The Giants might score on second down anyway.
  • Or they might score on third down.
  • Or they might score a field goal on fourth down.  (We'll assume they wouldn't try to score a touchdown.)
  • Or they might fail to score at all, either because of a turnover or a missed field goal.
If the Patriots allow the touchdown, and we assume for the time being that the Giants don't refuse that touchdown, then the Patriots would have to score a touchdown in about a minute, with one timeout remaining.  Let's say they're able to do that with some probability qTD.

On the other hand, if the Patriots play defense, then there are those four possibilities:
  • If the Giants score on second down, the Patriots still have to score a touchdown with about a minute remaining, and one timeout.
  • If the Giants score on third down, the Patriots have to score a touchdown with about a minute remaining, but no timeouts.
  • If the Giants score a field goal, the Patriots have to score a field goal with ten to fifteen seconds left, and no timeouts.
  • If the Giants fail to score at all, the Patriots can simply run out the clock.
If the Giants score on second or third down against straightforward defense, the Patriots are left in pretty much the same situation as if they just let them score on second down, modulo that timeout.  So as it stands, they're just a bit worse off if they play defense.

Now let's take a look at those last two cases.  If they don't score on second or third down, the remaining possibilities are a turnover, a missed field goal, or a made field goal.  Out of those, I'd guess the made field goal happens nineteen times out of twenty.  In the remaining cases, the Patriots just have to sit on the ball, which I'd also guess would happen nineteen times out of twenty (remember, they might have to avoid the safety).  So the question roughly boils down to, which is more likely: Scoring a touchdown in a minute, or one of the following happening—scoring a field goal in ten to fifteen seconds, securing a turnover, or the Giants missing a field goal?  If it's the touchdown, the Patriots should let the Giants score.  If it's any of the remaining three choices, they should play straightforward defense.
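To see how that comparison might play out, here's a back-of-the-envelope sketch in Python.  The two nineteen-out-of-twenty guesses come from the discussion above; the touchdown and field-goal probabilities are purely illustrative assumptions, so plug in your own estimates:

```python
# Hypothetical probabilities -- only the two 19/20 guesses are from the text.
q_td = 0.25            # Patriots score a TD with ~1 minute and a timeout (assumed)
q_fg = 0.10            # Patriots score a FG with 10-15 seconds, no timeouts (assumed)
p_giants_fail = 0.05   # Giants turn it over or miss the kick ("one in twenty")
p_run_out = 0.95       # Patriots safely kneel it out if the Giants fail ("19 in 20")

# Option 1: let them score; the Patriots then need the touchdown.
win_if_allow = q_td

# Option 2: play defense (ignoring the chance the Giants score on second or
# third down anyway, which only pushes this option back toward Option 1).
win_if_defend = p_giants_fail * p_run_out + (1 - p_giants_fail) * q_fg

print(win_if_allow, win_if_defend)
```

With these particular assumptions, letting them score comes out well ahead; with a pessimistic q_td the comparison flips, which is the sense in which it's a close call.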

Given that Lawrence Tynes hadn't missed a field goal of thirty yards or less in forever, the Giants were going to play possession football, and the Patriots would have no timeouts left for a field goal attempt, I'd go with letting them score, just as Belichick did.  But there's no way this is a foregone conclusion.  Sometimes, it's just a close call.

Thursday, February 2, 2012

The Philosophy of Probability

I've previously written about the importance of context in statistical analysis, though only as a prelude to saying something about game theory, and how it can influence one-dimensional performance measures in obscure ways.  Well, hold onto your hats, because now I'm going to talk directly about context.

As a simple, frivolous example of context, consider the claim (possibly apocryphal—it's such a good story) of an ESP researcher who apparently did a comprehensive study with a large number of volunteers.  A thousand or so, in fact.  And he said that although most of the volunteers lacked any notable ESP talent, one in particular seemed to have it in spades.  In fact, he said, he had done a statistical analysis on the results, and this volunteer had scored "at the three-sigma level."

What does that mean?  It means that the results of his volunteers followed a bell-curve sort of distribution, which has a standard deviation.  The standard deviation is a measure of the spread or width of the bell curve, and is denoted by the Greek letter sigma (σ).  So a result that is 3σ above the average is very unusual indeed.

How unusual?  Well, a 1σ result is high enough that we'd expect only one volunteer in six to score that high, just out of random chance.  A 2σ result is high enough that we'd expect only one volunteer in forty to score that high.  And a 3σ result is high enough that we'd expect only one volunteer in a thousand to score that high.  So that must be significant, right?  I mean, only one in a thousand volunteers could be expected to score that high by chance.  Oh, except that there were a thousand volunteers, so maybe it's not so significant after all.
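Those figures come from the upper tail of the normal distribution, and they're rounded for effect; the exact one-sided tails can be computed in a few lines of Python:

```python
import math

def upper_tail(k):
    """P(Z > k) for a standard normal variable Z (one-sided tail)."""
    return 0.5 * math.erfc(k / math.sqrt(2))

for k in (1, 2, 3):
    p = upper_tail(k)
    print(f"{k} sigma: about 1 in {1 / p:.0f}")
```

(The exact odds are about 1 in 6, 1 in 44, and 1 in 741, so the round numbers above are in the right ballpark, and the punch line survives: with a thousand volunteers, a 3σ score or better is more likely than not to show up by chance.)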

OK, maybe that story is apocryphal; I couldn't find it with a Google search.  (Except now that I've posted this story, I'll be able to find it on my blog.)  But there is this excellent xkcd comic, which makes exactly the same point.

The first time I was directly confronted with this was many years ago, when I was tutoring a family friend in probability and statistics.  Her teacher had assigned her a worksheet, and one of the problems concerned airplane accidents:
A survey of U.S. airplane accidents included seven major accidents involving fatalities.  Although the survey covered five airlines, four of the accidents involved a single airline: US Airways.  Is this statistically significant at the 5% level?
(There may really have been such a survey: There was a period in the early-to-mid 1990s in which US Airways did in fact have a slew of major accidents.)  Now, I should say something about that last sentence, because it sounds like we're asking what the probability is that the observation is just a result of random chance, and whether that probability is less than 5 percent.  But that's not actually quite right.

When we say that something is statistically significant at a given probability level, that refers to something called the null hypothesis, a central notion in statistics, and the inspiration for the name of this blog.  The exact interpretation of the null hypothesis depends on the kind of problem you're examining, but roughly speaking, it asserts that there is no correlation, that there is no effect to be measured, that everything observed is the result of random chance variation.  One doesn't (in fact, can't) prove the null hypothesis; in a sense, it is not even really assumed.  We just compare other hypotheses to it.

So, in this case, the null hypothesis is that US Airways is not in fact more likely to have accidents than any other airline, that each accident is equally likely to involve any of the airlines.  What we do then is to compute the probability that the observed pattern (four out of seven accidents involving US Airways) would arise if the null hypothesis is presumed, for the sake of argument, to hold.  If the resulting probability is less than 5% (1 in 20), then the observation is statistically significant at the 5% level.  If the resulting probability is less than 1%, then the observation is statistically significant at the 1% level.  And so on.

Well, let's go through the exercise.  If there are five airlines, and seven accidents, and each accident is equally likely to involve any one of the five airlines (we'll assume for the time being that no accident involves more than one of the airlines), the probability we want is given by the binomial distribution:

P[US Airways is involved in exactly four of seven accidents] = C(7,4) (1/5)^4 (4/5)^3 = 0.0287-

Since the probability is 2.87% < 5%, the observation is significant at the 5% level.  Even if you add in the probability that they're involved in five or more accidents out of the seven, that probability only swells to 3.33%, so it's still significant at the 5% level.
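A quick check of that arithmetic, in Python:

```python
from math import comb

def prob_exactly(k, n=7, p=1/5):
    """Binomial probability that a given airline is involved
    in exactly k of n accidents, each with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_four = prob_exactly(4)                              # exactly four of seven
p_four_plus = sum(prob_exactly(k) for k in range(4, 8))  # four or more

print(f"{p_four:.4f}  {p_four_plus:.4f}")  # 0.0287  0.0333
```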

Or so the teacher claimed.  I wasn't so sure.  I would say it depends a lot on what you're trying to determine, and here we get into an area where statistics is as much philosophy as it is mathematics and science.

The question is, why are you asking this question about US Airways?  Is it because you have some other, material reason for doubting the safety of their flights?  Or is it just the fact that four out of seven accidents involved them?  This may seem like arguing about the number of angels that can dance on the head of a pin, but in truth, your approach to the question depends vitally on which it is.  If it's because you have some other reason for suspecting US Airways—say, that they have shoddy maintenance records—then your line of reasoning is perfectly valid.

But if it's just the latter—if it's just a matter of noticing a cluster of US Airways accidents—then any airline at all might be the target of such a statistical analysis.  We should then be asking what the probability is that any of the five airlines was involved in four (or more) of seven accidents.  Since only one airline can be involved in four or more of the accidents, we can determine that probability very simply, by multiplying the single-airline probability by five, in which case we get 16.67%, which is decidedly not statistically significant.  There's actually a one-in-six chance that some airline would be involved in four or more of seven accidents.
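The multiplied-out version, again in Python:

```python
from math import comb

# Probability that a *given* airline has four or more of the seven accidents,
# assuming each accident independently involves one of five airlines.
p_single = sum(comb(7, k) * (1/5)**k * (4/5)**(7 - k) for k in range(4, 8))

# No two airlines can both claim four-or-more out of seven accidents, so the
# five events are mutually exclusive, and their probabilities simply add.
p_any = 5 * p_single

print(f"{p_any:.4f}")  # about 0.1667 -- roughly one chance in six
```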

Why, that's only 1σ!  Big deal!