Thursday, October 22, 2009

Something to Do With Math, Right?

In my last post, I mentioned that scoring differential has been shown to be a better predictor of future wins than even past wins are. What this referred to, specifically, is the so-called Pythagorean expectation (PE), a creation of baseball statistics guru Bill James. It's called that because of the form of the PE formula: If you let RS be runs scored by the team, and RA be runs scored against the team, then a good estimator for the winning percentage—at least in baseball—is

WP = RS2 / (RS2 + RA2)

So, for instance, if over the course of a season a team scores 800 runs, but only gives up 600, then the PE formula predicts that their winning percentage will be about 8002 / (8002 + 6002) = 0.640.

Actually, there's nothing magical about the exponent 2 in this formula; as it turns out, an exponent of 1.81 matches actual winning percentage better than 2 does. What I'd like to do in this post is say a few words (well, who are we kidding here, more than a few words) about where this exponent comes from, and an interesting correlation.

Baseball, like any sport, can be treated like a combination of strategy, tactics, and random events. The strategy and tactics represent those things that are under the control of the two teams, while the random events are things that are out of their control, such as where the baseball hits the bat, how it bounces off the grass, and so forth. Technically, as I've said before, these aren't actually random, but they happen so quickly that they're essentially random for our purposes; we can't perfectly predict how they'll go. All we can do is assign probabilities: e.g., such-and-such a player will hit it up in the air 57 percent of the time, on the ground 43 percent of the time, stuff like that.

As a result, the outcome of games aren't perfectly predictable, either; as they say, that's why they play the games. Again, we can assign probabilities—probabilities that a team scores so many runs, or gives up so many runs, or that they win or lose a particular game. The PE formula is an attempt to relate the probability distribution of runs scored and runs given up, to the probability distribution of winning and losing.

The probability distribution can only be specified mathematically, but we can get an inkling of how it works by sketching it out schematically.


In the diagram above, the horizontal axis measures runs given up, and the vertical axis measures runs scored. The diagonal dotted line represents the positions along which the two measures are equal, so if you're above that line, you win the game, and if you're below it, you lose the game.

The red blob depicts the probability distribution of runs scored and given up for a hypothetical team. Each point within the blob represents a possible game outcome. Games in the lower left are pitcher's duels, while those in the upper right are shootouts. Those in the other corners are games in which the team either blew out their opponent or were blown out themselves. Any outcome within the red blob is possible, but they're more likely to be clustered in the center of the blob, where it's a darker red. The particular way in which the games are clustered around that middle is known as the normal or Gaussian distribution. Such a distribution is predicted by something called the central limit theorem, and is also borne out by empirical studies.

From this diagram, we can estimate what the team's winning percentage is: It should be the fraction of all the red ink that shows up above the diagonal dotted line. Since the team scores, on average, a bit more than it gives up, more of the blob is above that line than below it, and their winning percentage should be somewhat above 0.500—say, 0.580, maybe. What Bill James found out was that if you compute the "red ink fraction" for a variety of different values of runs scored and runs given up, the results were essentially the same as those yielded by the formula given above.

Now, as it so happens, if you try to apply the same formula to, say, basketball, it doesn't work very well at all. Practically any team will end up with a predicted winning percentage between 0.450 and 0.550, and we know very well that isn't so: Usually there's at least one team over 0.750, and often times one over 0.800 (Cleveland did that this past season). The reason can be seen if we take a look at the corresponding "red ink" diagram for basketball.


Baseball scores runs, and basketball scores points, but the principle is the same. What isn't the same, however, is the degree of variation in the scores, relative to the total score. Basketball teams show much less variation in the number of points they score than baseball teams do. Basketball teams rarely score twice as much in one game as they do in any other; by comparison, baseball teams are occasionally shut out and occasionally score 10+ runs.

In consequence, a baseball team that scores 10 percent more runs than it gives up will still lose a fair number of games, because the variation in scores is much more than 10 percent a lot of the time. In contrast, a basketball team that scores 10 percent more points than it gives up will win a huge fraction of the time, because the variation in scoring is so much less. As you can see above, the red blob is in approximately the same place in both diagrams, but because the blob is smaller (less variation), practically all of the blob is now above the diagonal line, corresponding to a winning percentage of, oh, let's say 0.850.

This property can be addressed by using James's PE formula, but with a much higher exponent. Estimates vary as to how much higher, but the differences are relatively minor: Dean Oliver suggests using 14, whereas John Hollinger uses 16.5. Either of them will give a good prediction of the winning percentage of the applicable team.

It would be nice not to have to guess at the right exponent, though. So, since there seems to be a pretty obvious correlation between the size of the blob and the size of the exponent, I decided to investigate exactly what that correlation was. It seems likely that someone else has done it before, but a Web search didn't turn up any obvious results, so I'm sharing mine here.

To begin with, there's something else in statistics called the coefficient of variation, which basically gives in this case the size of the blob, relative to how far it is from either axis. In case you're following along on your own paper, it's defined as the ratio of the standard deviation of the distribution to the mean. So, in baseball, the c.v. is relatively large; and in basketball, it's relatively small.

What I did was to figure out, from numerical computations, what the "red ink" fraction was for various c.v.'s and scoring differentials, and to see if a formula of James's basic structure, with the right exponent, would fit those fractions. (My tool of choice was the free and open-source wxmaxima, in case you're interested.) They did, very well. In fact, I found it startling how well they fit, assuming that scoring was normally distributed. In most cases, the right exponent would fit winning percentages to within a tenth of a percent.

For instance, for a c.v. of 0.5, an exponent of 2.26 fit best. The numerical computation showed that a team that scored 20 percent more than it gave up would win 60.1 percent of the time; so did the formula. As the c.v. went down, the exponent went up, just as you would expect. The actual values:

c.v. = 0.5, exp = 2.26
c.v. = 0.3, exp = 3.78
c.v. = 0.2, exp = 5.67
c.v. = 0.1, exp = 11.7

I found these results startling: the product of c.v. and exp is almost constant, at about 1.134. (I propose calling this the Hell relation.) In other words, the right exponent is almost exactly inversely proportional to the c.v. of the scoring distribution. Therefore, we would predict that the c.v. of baseball games is 1.134/1.82, or 0.623; that of basketball would be 0.081 or 0.069, depending on whether you trust Oliver or Hollinger. I've heard that Houston Rockets GM Daryl Morey once determined an exponent of 2.34 for the NFL, which would correspond to a c.v. of 0.485.

Obviously, this is a consequence of the particular scoring model I used, but the normal distribution is broadly applicable to a lot of sports, most of which have games that are long enough to allow normalcy to show up. Given how well the basic structure of James's formula holds up, I suspect the underlying assumptions are fairly valid, although it would be interesting to see that verified.

EDIT: Here's an article from a statistics professor on just this very topic, with a rigorous derivation of the various formulae.

Monday, October 19, 2009

Adjusted Plus or Minus (More or Less)

I spent some time a while back discussing PER and its limitations. Today I'll take a similar look at adjusted plus-minus, or APM.

One of the weaknesses of PER is that it's a rather arbitrary linear combination of basketball statistics. As I pointed out, one can come up with alternate combinations that put any number of players on top of the PER list. In math nerd terms, any player on the convex hull of the statistics space can end up on top, given the right PER formula. With as many dimensions in that space as there are component statistics, that could end up being a lot of players.

And anyway, the bottom line of the game is winning, and there's no clear evidence that maximizing team PER (however you define that) maximizes your chances of winning. (It must be emphasized, by the way, that that's all any statistical approach can do: maximize chances. Basketball may be played on the floor, not on a piece of paper, but the small contingencies that lead to winning or losing are so complex and so numerous that the only thing we can do with them is treat them as essentially random events. Nothing is ever really certain in any practical sense.)

APM is a completely different approach to player assessment that attempts to remedy this weakness. Its purpose is to determine how much a player contributes to his team's scoring margin versus the opponents, which has been shown, to varying degrees of certainty, to be a good predictor of future winning percentage—better even than past winning percentage. It does this by calculating how much the team outscores its opponents with that player on the court. There's a few ways we could do this (just as there are multiple ways to define PER); I'll just be discussing one of them.

As its name implies, APM is an adjusted form of raw plus-minus, which we can call RPM for the moment. The difference between the two can best be illustrated using a simplified example. Suppose some Lakers players (Kobe, Pau, and Lamar) are participating in a two-on-two tournament, with substitutes allowed. Games are 48 minutes long. Let's say that in a particular game, Kobe and Pau open the game and play for 16 minutes, outscoring the opponents by 8. Pau and Lamar play the next 16 minutes, outscoring the opponents by just 2. Finally, Kobe and Lamar close the last 16 minutes, and outscore the opponents by 4. For the sake of simplicity, let's assume for now that the opponents have no sub and play the entire game with the same two players.

During the 32 minutes that Kobe's on the floor, his team outscores the opponents by a total of 12 points. Over a full 48-minute game, that would work out to a RPM of +18 (a 48-minute game is half again as long as Kobe's 32 minutes). Similarly, Pau's 48-minute RPM is +15, and Lamar's is +9.

However, you might ask, for instance, how much of Pau's RPM is due to his own contribution, and how much is due to sharing the court with Kobe? This is the question that APM seeks to answer. It attempts to account for the teammates one plays with, as well as the opponents one plays against (though we're keeping those constant for now).

One might compute the APMs of the three players as follows: Let Kobe's, Pau's, and Lamar's APM be represented by k, p, and l, respectively. From the first 16 minutes, we extrapolate that if Kobe and Pau played the entire game, they'd have outscored the opponents by 24 points. That could mean that both players have APMs of +24, or perhaps Kobe's is +28 and Pau's is +20, or maybe vice versa. There's not enough information to determine for sure. However, at any rate, they add up to 48:

k + p = 48

Similarly, we can write for the other two 16-minute segments

p + l = 12
k + l = 24

I'm not going to go through the gory algebra (I'm assuming you can do that yourself if you've read this far), but these three equations in three variables yield a unique solution: k = +30, p = +18, l = - 6. By way of interpretation, if you had two Kobes play against two average players for an entire game, the Kobes would win by 30 points. (Various versions of APM scale this so that you can just add up the APMs to determine the expected final winning margin. There's no significant difference between this and what we derived; they would just differ by a constant factor—the number of players—so that the scaled APMs would be +15, +9, and - 3, respectively.)

Note that nowhere in all of this computing did we say anything about scoring, rebounds, assists, steals, blocks, fouls, etc.—any of the statistics that make up aggregate parameters like PER. APM is entirely agnostic about what makes players valuable to their team; it simply measures that value. In a way, this is useful, because it completely short-circuits any assumptions about what makes players valuable in general; on the other hand, it sure would help if you knew why your player was valuable. APM can't really answer that. It is, in a very real sense, the holistic yin to PER's reductionistic yang.

Incidentally: What happens if the opponents do use different line-ups? Suppose the Lakers are playing the Magic, with Dwight Howard, Vince Carter, and Rashard Lewis. We'd use d, v, and r to represent their APMs, and assuming they played those line-ups in the same 16-minute segments as the Lakers did, we'd write out something like the following equations:

(k + p) - (d + v) = 48
(p + l) - (v + r) = 12
(k + l) - (d + r) = 24

Note that we now have three equations in six variables, which means that the scenario is said to be underdetermined: there won't be a unique solution to the equations, but multiple solutions (an infinite number, in fact). In general, there will be some kind of mathematical mismatch like this: There are as many variables as players, but as many equations as there are matchups, and those usually won't be equal. Since the number of matchups is larger than the number of players, though, you'll typically have overdetermined scenarios: there won't be any exact solutions at all; any combination of numbers will violate one equation or another.

That sounds bad, but in a sense, it's better than being underdetermined, because we can use statistical methods to determine the best near-solution to the equations—"best" in this case defined by how little the equations are violated as a whole. We can justify this by observing that players aren't robots—their performance varies up and down over the course of a game or a season—so some error in the equations is expected. Typically, the statistical method used is some form of linear regression, which is the same method used to identify likely correlations in all manner of scientific studies. In general, such methods work very well indeed.

I am, however, going to go off the reservation a little: I'm claiming that it might not work so well for basketball.

The key sticking point is hinted at by that name, linear regression, but it's present even in the deterministic case we worked out when Kobe, Pau, and Lamar were taking out their aggression on some hapless two-man team with a constant line-up. I said, for instance, that if Kobe and Pau both had APMs of +24, then they'd outscore the opponents, over an entire game, by those 24 points. Not so earthshattering; if they had in fact played the whole game, that's exactly the APM they'd have ended up with.

But then I also suggested that their APMs might be different: Kobe's could be higher and Pau's lower, or the other way around. And most crucially, I suggested that if one was higher, then the other must be lower by the same amount, so that they always add up to 48. In technical terms, we assume that APM combines linearly. That hidden assumption is part and parcel of the APM calculation; it is what allows us to make the determination that although Kobe's APM and Pau's could be any values individually, they must add up to 48. Without the linearity assumption, we can't write any equations at all; we can't compute APM, statistically or otherwise.

If you think about it, though, what justifies this addition of APMs? What makes us think that we can just add players willy-nilly, like numbers? I personally can't think of a thing that justifies that in anything close to a rigorous way. On the contrary, there's every possibility that they don't always add that way. If two players are both offensive powerhouses but defensive milquetoasts, they might both have good APMs because they spend all of their time playing with teammates that cover for their defensive weaknesses. Put them together, though, and since there's only one ball to score with, their collectively miserable defense might make them a net minus. (EDIT: Wayne Winston's version of APM, at the very least, tries to account for this. Look closely at Winston's answer to Question 5 here, and you'll see that his model includes an "interaction" factor that is a function of a pair of players. As a result, you have an affine relation instead of a linear one, and at least some of the first-order issues with linearity are taken care of.)

The linearity assumption is so seductive because it seems natural and jibes with lots of our experience. If I can grade 20 exams per hour, and you can too, then together we can grade 40 exams per hour. But in any endeavor that requires lots of teamwork and collaboration, the assumption becomes more tenuous. That doesn't unfortunately make it any less critical to the validity of things like APM. It simply has to be demonstrated for us to have any legitimate confidence in the value of APM; it isn't incumbent on anyone else to show that the linearity assumption doesn't hold, but for APM proponents to show that it does.

More insidiously, because linearity seems so natural, we are likely to miss its pivotal role in statistical measures like APM. Perhaps someone somewhere has done a study to validate the linearity assumption for APM. But if so, I haven't seen it, and I bet neither have most APM adherents. If you have, please share it!

Thursday, October 1, 2009

Inconsequence (A Jazz Tune)

Something a little different. A test of the video embedding, I guess. (Could it have picked a more objectionable thumbnail?)


An original composition. In my Walter Mitty fantasy world, this is part of a stage musical and is performed twice; the reprise has slightly different lyrics. For my own nefarious purposes, I have Frankensteined the two into one.

Here we are, you and I,
Face to face, eye to eye.
Shouldn't time give a soul
Who while wondering was blundering
A chance to be whole...?

...Hold that thought, just a mo,
Never mind, let it go.
Doesn't matter what we do
From here on, from here on I'll smile
In consequence of you.

This song is Copyright © 2009 by Brian Tung. All rights reserved. Product may have settled during shipping. Do not incinerate. Objects in mirror may be closer than they appear. Operate in a well-ventilated environment. Handle with care. Do not taunt Happy Fun Ball. Contents under pressure. Do not inhale.

Thursday, August 27, 2009

Stardust Memories

When I was ten, my dad took a couple of friends and me to see a movie. My friends and I had the choice of watching Rollercoaster, which was about a terrorist attempting to extort money from amusement parks by blowing up sections of rollercoaster track just as the coaster gets to them, or this new science fiction film that had recently opened and was getting good notices. As you've no doubt guessed, we chose poorly, while my dad went to the other film, which was (as you've probably also guessed) Star Wars. Meanwhile, one of my friends threw up on the car ride home.

I saw Star Wars in the theater four times, which to this date remains the last time I ever saw a film multiple times in the theater. Early in the film, right after the text crawl, but before the rebel ship comes on screen, you're treated to a view of a star field. In fact, here it is (click to enlarge):


When I saw the film again recently, there was something vaguely unsettling and unnatural about the look of the stars in this scene. For the sake of comparison, here's a real star field, with roughly the same level of detail (again, click to enlarge):


What strikes me now (although I was oblivious to it back in 1977, at least consciously) is how much more regular the star field is in the Star Wars frame than it is in the real photograph. There isn't much variation in the stars in the movie frame, with the top fifty or so being about the same brightness; in contrast (no pun intended), there are many more dim stars in the real photograph, and they fade out gradually, suggesting that there are plenty of stars that are in the field of view, but just beyond the limits of detectability, in this photograph at least. And there are, in fact. For some reason, that sense of infinity, which isn't in the movie frame, appeals to me greatly.

You can sort of see the reasoning behind this if you imagine for the moment that all stars are of the same intrinsic brightness, and that the only reason that some appear brighter and some appear dimmer is that they're closer or further away. (Sort of the way that most adults are of about the same height, but appear to be different sizes because they're at different distances.) And because there is more space far away than there is close up, there are more stars that are far away and therefore dim than there are stars that are close up and therefore bright.

Now, as it happens, stars do vary in actual brightness—sometimes dramatically—but the basic explanation still holds, and is supported by actual counts of bright stars versus dim stars. And I think that through long association with the night sky, we gain an appreciation for that kind of aesthetic. Once upon a time, every human on the planet with reasonably good vision had that association. Nowadays, it's less common. But the potential is still there within each of us, and in my case, it expressed itself in, among other things, my preference for the real star image rather than the Star Wars movie frame.

And this set me to wondering whether a sense for this kind of aesthetic could be mechanized in any way. In a very naïve way, it surely could. The way that the star counts vary by brightness follow a fairly well-understood formula, and a star field could easily be scanned for how well it matches that formula. But I think it's a common feeling that that would fall well short of a genuine sense of aesthetics. There would have to be a larger framework for that kind of aesthetic sense.

Could such a framework lie in fractals? Fractals are, generally speaking, patterns that are self-similar; that is, the appearance of the whole at a large scale is repeated in small parts of the pattern at smaller scales. Examples of fractals range from prosaic snowflake patterns:


to the sublime Mandelbrot set:


Fractals have been used to describe natural patterns as varied as the sound of wind through trees and the coastline of Great Britain. And they can be used to describe the appearance of star fields as well. A star field looks quite the same if you zoom in and increase the brightness. The details are different, so in that sense it is not quite like the snowflake fractal or even the Mandelbrot set. But statistically, the close-up shot and the wide-angle shot are essentially identical.

I cannot say exactly what it is about the "fractality" of these patterns that is appealing. And it does seem as though a certain sense of variation (absent in the snowflake, present to an extent in the Mandelbrot set, and rampant in real star fields) is vital to maintaining visual interest. But I can't escape the notion that self-similarity is something that people generally find captivating and inviting, once they recognize it, and is a large part of why looking up at the night sky is such a natural thing to do.

Seventh Night

Last night was Seventh Night (七夕), the seventh night of the seventh month in the lunisolar calendar followed traditionally by the Chinese. Because the Chinese calendar usually starts with the second new moon after the winter solstice, Seventh Night usually falls sometime in August in the western calendar.

Seventh Night is associated in Chinese tradition with the story of the Cowherd and the Weaver Girl. In one common telling of the story, a young cowherd by the name of Niulang (牛郎) came across a fairy girl bathing in a lake—a girl named Zhinü (織女). Fascinated by her beauty, and emboldened by his companion, an ox, he stole her clothes and waited by the side of the lake. When she came out looking for her clothes, Niulang swept her up and took her back home. In time, they were happily married with two children. But when the Goddess of Heaven found out that a fairy girl had married a mere mortal, she grew furious and sent Zhinü into the sky, where she became the bright star Vega, in the constellation of Lyra the Lyre. (Watercolor by Robin Street-Morris, 2007.)

When Niulang discovered that his wife had disappeared, he searched high and low for her, but was unable to find her. Eventually, the ox told Niulang that if he killed him and wore his hide, he would be able to ascend the heavens to find Zhinü. Niulang did as the ox suggested, and took his two children with him to find his wife, becoming as he did the star Altair. Find her he did, but the Goddess of Heaven, angered once more by Niulang's impertinence, drew a river of stars—the Milky Way—forever separating Niulang (the star Altair) from Zhinü. Their two children became Tarazed and Alshain, the two dimmer (but still bright) stars that flank Altair in the constellation of Aquila the Eagle. But apparently the Goddess of Heaven was not entirely heartless, for once a year, on the seventh night of the seventh month, she sends a bridge of magpies (鵲橋) to connect the two lovers, for just one evening. And so Seventh Night is associated with romance (and also, interestingly, with domestic skills).

The celestial setting for the entire tale can be found in the Summer Triangle, which is bounded by three stars: Altair, Vega, and Deneb (in the constellation of Cygnus the Swan, also known as the Northern Cross). The Summer Triangle can be found in the night sky throughout summer and autumn; at this time of year, it passes nearly overhead at about ten in the evening. (Photograph by Bill Rogers of the Sa-sa-na Loft Astronomical Society, 2009; click to enlarge.)


Wednesday, August 26, 2009

How Random is Random?

We all think that we know when something is random. But how random is random?

Part of the aim of mathematics is to unify concepts. It's what makes mathematics more than just a collection of ways to figure things out. As a side effect, though, mathematics definitions tend to be a bit counterintuitive. For example, I think we all know what the difference between a rectangle and a square is: A square has all four sides of equal length, and a rectangle doesn't.

Except that a mathematician says that squares are rectangles, because to a mathematician, it's inefficient and non-unifying to say that a rectangle is a four-sided figure with four right angles, except when all four sides have the same length. It makes more sense, from a mathematical perspective, to make squares a special case of rectangles.

So hopefully it won't come as too much of a surprise if I say that a completely deterministic process, such as flipping a coin that always comes up heads, is still considered a random process to mathematicians who study that sort of thing. So is a coin that comes up heads 90 percent of the time. Or 70 percent. Or—and maybe this is the surprise, now—50 percent. The cheat coin is simply a special case of a random process. To a mathematician, none of these processes is "more random" than the others. They just have different parameters.

What we think of as randomness, mathematicians call entropy. This is related to, but not the same thing as, the thermodynamic entropy that governs the direction of chemical reactions and is supposed to characterize the eventual fate of the universe. (Another post, another time, perhaps.) It turns out that this "information-theoretic" notion of entropy corresponds pretty well to what the rest of us call randomness. For those of you who are even the slightest bit curious, the definition of entropy for a flipped coin is

S = - ( pH lg pH + pT lg pT )

where pH and pT are the probabilities for heads and tails, respectively, and lg is logarithm to the base 2. For a 50-50 coin, the entropy is S = 1; for a completely deterministic coin (a two-headed one, for instance), the entropy is S = 0. For something in between—say, one that comes up heads 70 percent of the time—the entropy is something intermediate: in this case, S = 0.88 approximately.

So, all right, how entropic is a real coin? The answer is that it's probably less entropic—less random, that is—than you think it is, especially if you spin it. A paper by researchers from Stanford University and UC Santa Cruz (via Bruce Schneier, in turn via Coding the Wheel) has seven basic conclusions about coin flips:
  1. If the coin is tossed and caught, it has about a 51 percent chance of landing on the same face it was launched. (If it starts out as heads, for instance, there's a 51 percent chance it will end as heads.)
  2. If the coin is spun, rather than tossed, it can have a much larger than 50 percent chance of ending with the heavier side down. Spun coins can exhibit huge bias (some spun coins will fall tails up 80 percent of the time).
  3. If the coin is tossed and allowed to clatter to the floor, this probably adds randomness.
  4. If the coin is tossed and allowed to clatter to the floor where it spins, as will sometimes happen, the above spinning bias probably comes into play.
  5. A coin will land on its edge around 1 in 6000 throws.
  6. The same initial coin-flipping conditions produce the same coin flip result. That is, there's a certain amount of determinism to the coin flip.
  7. A more robust coin toss (more revolutions) decreases the bias.
Somewhat along the same lines, Ian Stewart, who for a while wrote a column on recreational mathematics for Scientific American, mentioned a study in one of his columns by an amateur mathematician (and professional journalist) named Robert Matthews. Matthews had watched a program in which the producers had asked people to toss buttered toast into the air, in a test of Murphy's Law as it applies to buttered toast. Somewhat to their surprise, the toast landed buttered side up about as often as it landed buttered side down.

Matthews decided that was not quite kosher. People, he thought, don't usually toss buttered toast into the air; they accidentally slide it off the plate or table. That ought to be taken into account when analyzing Murphy's Law of Buttered Toast. And when he did take it into account, he found something rather unusual. A process that you might have thought was fairly entropic turned out to be almost wholly deterministic, given some not-so-unusual assumptions about how fast the toast slides off the table. Unless you flick the toast off the table with significant speed, the buttered side lands face down almost all of the time. And it has nothing to do with the butter making that side heavier; it's that the rotation put on the toast as it creeps off the table is just enough to give it a half spin. Since the toast starts out buttered side up (one presumes), it ends up buttered side down. Stewart recommends that if you do see the toast beginning to slide off the table, and you can't catch it, to give it that fast flick, so that it isn't able to make a half flip, and lands buttered side up. You won't save the toast, unless you keep your floor fastidiously clean, but you might save yourself the mess of cleaning up the butter.

On the other hand, maybe there's another solution.

Friday, August 21, 2009

Lines for Fries (and Fry's)

Thoughts about (and while) waiting in line at the neighborhood McD's. Mmm...fries.

The Most Evil Being mocked the last queueing theory post, but he actually read the whole thing to mock it. Apparently, so did Squishy. I approve, of course. But Squishy noticed, with some temerity, that I had tagged that post with "queueing theory," indicating a potential for future posts about same. Well, the future is now. That post dealt with queueing theory itself as little as I could manage, which, OK, is still quite a bit, I guess. Fair warning: There's a bit more of it in this one.

If you were so foolhardy as to look in a queueing theory textbook, you'd probably see a representation of a queue as something like this:

The thing at the left is the queue, or waiting line; the colored blocks inside the queue represent customers; and the circle at the right is the server. Nowadays, in the computer world, we think of servers as big honking machines, but in general, it could be anything that provides a service. Say, an order taker at a fast food restaurant.

That diagram up there represents only one potential way to hook customers up with servers: a single queue with a single server. Lots of places, like McDonald's, or the supermarket, or the bank, have multiple servers available at a given time. How do they connect their customers to their servers? Here are two diagrams representing two options, without any explanation. Before reading on, see if you can figure out what queueing systems they represent, and how the lines you wait in everyday are arranged.

System (a), on the left, represents a queueing system in which each server (e.g., checkout clerk, bank teller, etc.) has his or her own line. System (b), on the right, represents a system in which all the servers together share a single line. Where I live, in Los Angeles, McDonald's and the supermarket use (a), and the bank and some other fast food places use (b). Your mileage may vary, of course.

OK, that wasn't too hard, probably. Now think about this one: All other things remaining equal, which system is better? And by better, I mean that it improves the time that you have to wait, on average, before getting service. We'll assume, to make things easier, that once you enter a queue or line, you stay in it; you don't give up, and you don't defect to another line. We'll also assume, furthermore, that all the servers are equally fast (or slow, depending on your point of view). Would you prefer to wait in (a), or (b)?

Before I answer that, let me first define some terms. All queueing systems have what's called an arrival rate, which is the rate, on average, at which new customers enter the queueing system. All servers have a service rate, which is the rate, on average, at which they can serve customers, assuming they have any customers to serve. One of the things I mentioned in that last queueing theory post was that a system is stable (that is, it doesn't jam up) if the arrival rate doesn't exceed the service rate. With me so far?

All right, one last term: The utilization of a server, or a group of servers, is the arrival rate divided by the service rate. So, pretty obviously, if the utilization of a server or servers is less than one, it's stable, and if it's greater than one, it's unstable—the line or lines get longer and longer. Somewhat less obviously, the utilization of a server is also the fraction of time that it spends actually serving customers, rather than sitting idle (which is why it's called the utilization in the first place).

Suppose Store A is using queueing system (a). It's got, let's say, six servers, each capable of serving one customer a minute. Customers come into the store at a rate of three customers a minute. Since each server gets one-sixth of all the customers, on average, each server's customer arrival rate is half a customer a minute, and each server's utilization is 1/2 divided by 1, or 1/2.

Store B, on the other hand, is using queueing system (b). It also has six servers, each of which also serve at one customer a minute. Because they all get fed from the same line, it's convenient to think of them as together serving customers at a rate of six customers per minute. If the arrival rate to Store B is the same, three customers per minute, then the utilization of the six servers, combined, is 3 divided by 6, or again, 1/2. So far, it seems like the two systems are pretty equivalent.

However, Store A has a problem that Store B doesn't. Consider the situation diagrammed below:

Five of the servers are busy, and they even have customers waiting in line behind them. The sixth server, however, is entirely idle, but because we've assumed that customers don't switch lines, it has nobody to serve. (Lest you think this is entirely unrealistic, I see it all the time at the supermarket, possibly because the idle server is a few counters away from the busy ones.) This is bound to happen from time to time, since the utilization is less than one. Servers are going to be idle every now and then, and if that happens when some other customers are waiting to be served, Store A is going to be inefficient at those times.

Note that this never happens to Store B. Certainly, servers are going to be idle from time to time, and customers are going to have to wait from time to time. But they never both happen at the same time. Any time a server comes idle, if there's any customer waiting for service, it can go straight to that server. As a result, Store B, and queueing system (b), is better for the customers: They wait for a shorter time, on average, than customers at Store A.

What's more, queueing system (b) is fairer, in the sense that customers that arrive first are served first. That doesn't always happen with queueing system (a). In the situation depicted above, if a customer now arrives to that sixth, idle server, it gets served immediately, without having to wait, even though customers that arrived previously to other lines are still waiting. So (b) is doubly better than (a).

In light of this, it shouldn't come as any surprise that Fry's Electronics, essentially the store for übergeeks, uses system (b) in every one of its stores I've been in. It even takes advantage of the longer single line (as opposed to an array of shorter lines) by snaking it between and amongst a panoply of impulse buys. One could argue that supermarkets can't really take proper advantage of system (b), because people usually have carts, and these take up a lot of room, which would obstruct other supermarket traffic. (I also haven't considered the effect of the 12-items-or-less express lanes.)

But a place like McDonald's has no such excuse. Even if you make the point that people switch lines when there's nobody waiting at a server (because the service counter is not so large), it's still unfair, in that it's not first-come-first-served. And other fast food places are perfectly willing to arrange a single line for all servers.