Tuesday, February 4, 2014

One Language, Under Force

I watched the Super Bowl.  Well, "watched" might be putting it a bit strongly.  I watched the first part, a very short part, in which the Broncos seemed as though they might have had a decent chance to win.  After that, I watched mostly to see what other parts could fall off the Denver bandwagon.  Congratulations to Seattle; they thoroughly outclassed their opponents.

That left the halftime show and the commercials.  I have to say that I didn't even watch those very assiduously, though I find the idea that they aren't as good as they used to be to be about a step or two shy of yelling at the neighborhood kids to get off the lawn.  It's Cranky Old Geezer time!

But even through my haze of disappointment in the football game, I did manage to get a look at the Coca-Cola "America the Beautiful" commercial.

The one-minute spot consists of a sequence of short video vignettes of the broad span of Americana, against which is sung "America the Beautiful."  There's nothing at all contentious about that, as far as it goes, of course.  What seems to have gotten lots of people in a lather is the fact that, except for the first and last phrase, the song is sung in several different languages.  No doubt Coca-Cola wanted to invoke the idea that part of what makes America beautiful is the wide variety of people that make it up, and that's what the commercial does.  In fact, Coca-Cola went so far as to follow the commercial up with a tweet, just in case someone missed the point:

Apparently, that's not the message that many people got.  I imagine the reaction of Coca-Cola to some of the retweets ranged from bemused concern to horrified astonishment.  (Or maybe they're more cynical than that; it's quite plausible.)  I don't have the patience to drag them all out, obviously, so I'll just link to a collection of some of them here.

As might be expected, there's also been a backlash against those reactions, lambasting them as racist or ignorant or condescending, or who knows what.  I won't attempt to characterize them one way or the other; as I like to say, people feel what they feel, and it's pointless to tell them they're "wrong" to feel that way.  But I think it is interesting to try to suss out just why they feel that way.  What is it about diversity, in what seems like such a harmless context, that spooks some people?  Is our sense of national pride so fragile that it relies upon the exclusive use of a language that was brought forth onto this continent for the first time not half a millennium ago?  I have no idea whether one of the languages used in the commercial was an Amerind tongue (maybe someone can tell me), but I wonder what the reaction to that would be.

The melting-pot metaphor used to be a point of pride for us; it's a central point of one of those Schoolhouse Rock shorts, for those of you who remember those.  I don't remember anyone lashing back at those a few decades ago.  Shall we say to those who object to singing "America the Beautiful" in anything other than English that they are simply being too sensitive?

In connection with that possibility, let me introduce another commercial, which aired last year (and was brought back to mind by a friend of mine):

Notice anything out of the ordinary?  I have to admit that the first time I watched this, I didn't.  Then, my friend pointed out, "Look at who's in the box."  My reaction to this was, "Oh, PoCs in a box," since the forward-thinking hotel guest (who incidentally spouts some meaningless marketing mumbo-jumbo, but that's neither here nor there) is a white male, and all the persons of color, along with a white male or two (for variety I suppose), are in the box.  Even the guy who thinks to venture out before scurrying back to the safety of the box is a white male.  (I must say that the look of relief on the woman next to him is hilarious.  She should get a Clio for that.)

A "natural" reaction by some people, in response to such a comment, might well be "Oh, you're being too sensitive.  They had to put someone outside the box; it just so happens it was a white guy."  In isolation, that point might be arguable.  However, it happens too frequently for it to be just random chance.  The vast majority of business travellers I work with are white males, and it's not surprising that they (the primary target of the commercial, after all) would prefer to see someone like themselves as the hero of the story.

I see too frequently, however, the objection that people of color have an inferiority complex, that they play the race card too readily, that they are too comfortable in the victim role.  Does it really make sense that a group of people who are actually empowered would feel that way, that instead of doing what they're capable of, they would rather lie down and cry foul?  I'm not sure that there's been a significant group of people like that in the history of ever.  Regardless of whether that group is a victim of discrimination as they claim, or for some constitutional reason is less capable, or both, it's utterly implausible to me that they would rather blame someone else than have more power.  Blame may be a salve for what ails them, but equal power is the cure.  A small number of them might miss that, but not the whole group.

Could something similar be at work with the reaction to the Coca-Cola commercial?  Is it that people feel upset about the commercial because it represents a situation they have little or no immediate control over?  Undoubtedly that's part of it.  After all, the tweets are rife with threats to boycott Coke, but these threats would have essentially no impact on Coke's bottom line even were they credible.  As it is, I suspect the vast majority of those would-be boycotters will be back to drinking Coke before the month is out.  Inexpensive habits can be terribly hard to break.  And at any rate, Coca-Cola is here serving only as a proxy for what some evidently see as a distressing trend toward inclusiveness.

Isn't it provocative, though, that each side sees a given cultural portrayal as betraying an awful truth, and often speaks out vigorously against it—something that the other side views as oversensitive and tiresome?  And this may be the crux of the matter: that there is a kind of massive joint cognitive dissonance between the way that the various groups perceive the current cultural situation, and the way that the various groups think the situation should be.  This dissonance is made all the more contentious by the striking symmetry between the views.

There is one thing, however, that distinguishes the two cases, as exemplified by these commercials, and that is the distinction between equality and uniformity, something that has stuck with me ever since it was first explained to me in stark simplicity in Madeleine L'Engle's A Wrinkle in Time.

What is at the root of the desire for uniformity (for I see no other way to describe, as succinctly, the demand that people sing this song in English) that the Coca-Cola tweets share?  It seems to me that it aims for a feeling of security, that if we only trust those people who cleave to the majority culture, then all will be well in this world gone mad.  But if that's so, is it necessary to demand uniformity?  Can't we feel secure without insisting on the elimination of the traces of other cultures?  Why not cut the middle man of uniformity out of the picture entirely?

I fear, though, that this is not likely until people see that this kind of uniformity not only isn't the end goal, but is actually counter-productive as far as any real kind of security is concerned.  I like to say that religion is a laser of the people, by which I mean that it moves people to behave and operate in unison, almost as though they constituted a single being, which can do certain things that the individuals couldn't do, separately.  But that same uniformity has a cost, because if all the individuals uniformly have a weakness, that weakness is passed onto the group as a whole, and is not amortized, so to speak.  I'm reminded of the old Aesop's fable in which an old man, near the end of his life, demonstrates to his sons the value of unity by tying together a bundle of sticks.  That bundle, of course, could not be broken by vigorous effort, even as the individual sticks were easily snapped.  It's ironic to think, though, that one could quickly slice through the bundle if one were to cut lengthwise.

The amortization of weaknesses is what makes diverse groups so robust.  It's why a farm made up of a single strain of high-yield crops is not a good long-term strategy.  It's why a diversified investment portfolio is safer than one that relies on a single kind of asset.  And why shouldn't the same kind of reasoning apply just as well to people as to crops or funds?  Yes, uniformity is good in moderate doses, for it enables feats that could not be achieved otherwise, but in doses large enough to dominate an entire country, it's dangerous.  It's dangerous not only because it makes the country more vulnerable, but also because it is such an appealing dogma.

Who knows if there will come a time when ads like Coca-Cola's will not produce such a strong negative reaction.  But if it does, it will be because people understand, viscerally, the value of diversity, and do not see it for the demise of national security.

Tuesday, December 17, 2013

The Travelling Santa Problem

This is a problem that briefly entertains me each year around this time, because it's mathematical and I'm me.

The question is, "How fast does Santa have to go to visit all those homes?"  We're not going to assume he has to go down chimneys or anything like that; he just has to get to all of the homes.  But assuming that Santa is real, he is still subject to the laws of physics.  No getting around those.

The Travelling Santa Problem bears a distinct resemblance to another classic problem of computation, the Travelling Salesman Problem.  A typical statement of this problem (made as non-gender-specific as I can manage) goes as follows: A sales rep must visit the capital cities of all 48 contiguous states, in whatever order desired.  What order minimizes the total travel distance?  It doesn't matter whether the sales rep drives or flies; what matters is that there is a definite and known distance between any pair of capitals.

For small numbers of capitals, this problem is trivial.  Consider three cities: Sacramento CA, Carson City NV, and Phoenix AZ.  The air distances between these cities are S-C = 160 km, C-P = 930 km, and P-S = 1016 km (I got these figures from the City Distance Calculator at http://www.geobytes.com/citydistance.htm).  The sales rep, in order to minimize the total travel distance, should avoid the long Phoenix-to-Sacramento leg, and visit the cities in the order Sacramento, Carson City, Phoenix (or the reverse).

Adding a fourth city does complicate matters somewhat.  The cost of adding one city is three new distances.  If we add, say, Boise ID, the new distances are S-B = 712 km, C-B = 580 km, and P-B = 1179 km.  And instead of only three essentially different routes, there are now twelve: B-C-P-S, B-C-S-P, B-P-C-S, B-P-S-C, B-S-C-P, B-S-P-C, C-B-P-S, C-B-S-P, C-P-B-S, C-S-B-P, P-B-C-S, and P-C-B-S.  (There are twelve other orders, for a total of 4! = 24, but the other twelve are the reverse of those already listed, and are the same for the purpose of total distance.)  By exhaustive calculation, we find that the minimal path is B-C-S-P (or P-S-C-B), with a total length of 580+160+1016 = 1756 km.

One thing that becomes quickly apparent about this problem is that you can't solve it just by picking the three shortest distances, because those three distances may not connect all of the cities, or do so in a path.  Instead, in this case at least, we had to try all the different routes and pick the shortest overall route.  In fact, the Travelling Salesman Problem is a so-called NP-hard problem; this ties it with a number of other problems whose solution times are all expected to increase exponentially with the size of the problem, barring some unexpected theoretical advance.
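For the curious, the try-everything approach can be sketched in a few lines of Python.  This is only an illustration: the function names are my own, and the distances are the six figures quoted above.  It walks through each ordering once (skipping reversals) and reports the shortest open route.

```python
from itertools import permutations

# Air distances in km, as quoted in the text
# (S = Sacramento, C = Carson City, P = Phoenix, B = Boise).
DIST = {
    frozenset("SC"): 160,
    frozenset("CP"): 930,
    frozenset("PS"): 1016,
    frozenset("SB"): 712,
    frozenset("CB"): 580,
    frozenset("PB"): 1179,
}

def route_length(route):
    """Sum the leg distances along an open route such as 'BCSP'."""
    return sum(DIST[frozenset(leg)] for leg in zip(route, route[1:]))

def shortest_route(cities):
    """Try every ordering once (a route and its reverse count as one)."""
    candidates = ("".join(p) for p in permutations(cities) if p[0] < p[-1])
    best = min(candidates, key=route_length)
    return best, route_length(best)

print(shortest_route("SCP"))   # the three-city case
print(shortest_route("BCPS"))  # the four-city case
```

The number of orderings grows factorially with the number of cities, which is why this kind of brute force stops being practical very quickly.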

In the case of the Travelling Santa Problem, however, we are not interested in knowing how Santa knows what order to visit the homes, or even what order he actually visits the homes.  We just need to know, to a rough order of magnitude, how far he must travel to visit every home.

Let us consider that the current population of the Earth is about seven billion.  How many homes is that (if by home we mean any single living unit)?  There are some homes with lots of people in them, living as a unit; on the other hand, there are many homes with only one person in them.  We probably would not be too far off if we assume an average of two people per home.  That would mean 3.5 billion homes to visit.

Now, if these 3.5 billion homes were evenly distributed across the surface of the Earth, which has a surface area of about 510 million square kilometers, each home would have to itself an average of about 0.15 square kilometers, which means the mean home-to-home distance would be about 0.4 km, and Santa would have to travel about 3.5 billion times 0.4 km, or about 1.4 billion km.  If we assume that Santa has to travel all that way in a single day (86,400 seconds), that means he must travel about 16,000 km/s, a little over a twentieth of the speed of light.  So, very fast (about 35 million mph), but at least doable in principle.

In truth, it's a bit better than that.  In the first place, most of the Earth's surface is water; only about 30 percent of it is land.  Of that, the polar lands, especially around Antarctica, are not readily habitable in the usual way, so that perhaps only about 25 percent of the Earth's surface has any appreciable habitation.  That cuts the total distance in half to about 700 million km, and the necessary speed to 8,000 km/s.
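The arithmetic so far is easy to reproduce.  Here is a rough sketch, assuming (as the estimate does) that each home sits in its own square plot and that Santa travels one plot width per home; the function name is made up for the occasion.  Because it does not round the spacing up to 0.4 km, its speeds come out a little lower than the round figures above.

```python
import math

SECONDS_PER_DAY = 86_400
EARTH_SURFACE_KM2 = 510e6
HOMES = 3.5e9

def santa_speed(homes, area_km2, seconds=SECONDS_PER_DAY):
    """Return (spacing_km, total_km, speed_km_s) for evenly spread homes."""
    spacing = math.sqrt(area_km2 / homes)  # side of each home's square plot
    total = homes * spacing                # one hop per home
    return spacing, total, total / seconds

# Homes spread over the whole globe: about 0.38 km apart, roughly 15,500 km/s.
print(santa_speed(HOMES, EARTH_SURFACE_KM2))

# Homes spread over the habitable quarter only: about 7,700 km/s.
print(santa_speed(HOMES, 0.25 * EARTH_SURFACE_KM2))
```

Quartering the area halves the spacing, and so halves the distance and the speed.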

Even better, human homes are not evenly distributed across the 25 percent of the Earth's surface they cover, but are clumped together in towns, villages, and cities large and small.  We might consider a clump to be any collection of homes that are within 100 meters of at least one other home.  This means, among other things, that a single isolated home is considered to be a clump.

It's hard to know the exact number of such clumps in the world, but perhaps we would not be too far off if we let the average clump size be 350 homes.  In that case, the total number of clumps would be 3.5 billion, divided by 350, or 10 million clumps.  Spread over the habitable quarter of the Earth's surface, the average clump-to-clump distance would then be about 3.6 km, and the total clump-to-clump travel distance would be 10 million times 3.6 km, or 36 million km.  To that would have to be added the home-to-home travel in each clump of 350 homes.  If each home is separated from the next by 100 meters, and there are 350 homes, then each clump requires an additional 35 km, times 10 million clumps, or 350 million km, for a grand total of about 390 million km.  That cuts the necessary speed to about 4,500 km/s.

Of course, 100 meters is just the maximum cutoff distance between homes in a clump.  The average distance would be rather smaller.  In a major city like New York, for instance, the average distance is probably closer to 10 meters; in other areas, the average might be smaller than that.  In that case, the total clump internal distance for 350 homes would be more like 3 or 4 km, for a total distance of about 70 million km, with a required speed of about 800 km/s.

Finally, statistical studies show that if N clumps are randomly distributed over an area of about A = 130 million square kilometers (as we've assumed here), the unevenness caused by that random distribution creates some clumping in the clumps, so that the total clump-to-clump travel distance is given approximately by  √(NA/2) = √(650 million million square kilometers) = 25 million km, lowering the total distance to 60 million km, and the speed to 700 km/s.
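That last step can be checked the same way.  A minimal sketch, taking the √(NA/2) approximation quoted above at face value and assuming the 10-meter spacing within each clump; the names are mine.

```python
import math

SECONDS_PER_DAY = 86_400

def random_tour_km(n_points, area_km2):
    """Approximate tour length through n randomly scattered points."""
    return math.sqrt(n_points * area_km2 / 2)

clumps = 10e6
clump_to_clump = random_tour_km(clumps, 130e6)  # about 25 million km
within_clumps = clumps * 350 * 0.010            # 350 homes, about 10 m apart
total_km = clump_to_clump + within_clumps       # about 60 million km

print(total_km / SECONDS_PER_DAY)               # required speed in km/s
```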

That's still about 1.6 million mph—fast enough to go around the Earth in a single minute—so perhaps Rudolph better get going.

Wednesday, May 15, 2013

A Pair of Potter Poems (Picked by Peter Piper?)

Here are a couple of poems.  Sonnets again.  The schtick here is that they concern a pair of Harry Potter characters.  It should be trivial to figure out who they are (although it might require a dictionary for our younger readers).

(I admit I felt compelled to put these here so that the three of you coming from Ash's poetry blog don't feel like you got put on a bus to Hoboken, Nerd Jersey.)


The boy stepped forth and took his place beneath
the brim.  A minute passed, now two, then three,
within which time the shades of bravery
and justice armed their forces to the teeth.
Though all saw brav'ry take the palm and wreath,
it lay in waiting, seeming idly:
At length, his courage glowed for one to see,
demure, as though he'd drawn it from its sheath.
It wavered, unaccustomed to the light;
it felt about, uncertain of its tread.
Till blunt necessity called out its right,
to cleave the foul ophidian at its head.
Oh say! where night left off and day began,
to slumber off a boy and wake a man.


He stands, a glower made inscrutable,
ambiguous.  He wreaths his honest thoughts
in coronets of random noise, in knots
of truths both blank and indisputable.
The swollen ranks, beneath his gaze, bear gloom.
Their dully thronging stride stamps out the time
left to his bitter charge, and neither rhyme
nor reason can forestall his chosen doom.
Though he may carp or cavil over weights
none else has will or wherewithal to bear,
that memory, besmirched, of onetime mates
does focus his poor genius in its glare.
So pity not the fool who plays the lie--
once! twice! now thrice!--to gamble and to die.

Copyright © 2011 Brian Tung

Tuesday, May 7, 2013

Why CPU Utilization is a Misleading Architectural Specification


Actually, this post only has a little to do with queueing theory.  But I can't help tagging it that way, just 'cause.

Once upon a time, before the Internet, before ARPANet, even before people were born who had never done homework without Google, computer systems were built.  These systems often needed to plow their way through enormous amounts of data (for that era) in a relatively short period, and they needed to be robust.  They could not break down or fall behind if, for instance, all of a sudden, there was a rush in which they had to work twice as fast for a while.

The companies that were under contract to build these systems were therefore compelled to build to a specified requirement.  This requirement often took a form something like, "Under typical conditions, the system shall not exceed 50 percent CPU utilization."  The purpose of this requirement was to ensure that if twice the load did come down the pike, the system would be able to handle it; that is, the system could handle twice the throughput that it experienced under typical conditions, if it needed to.

One might reasonably ask, if the purpose was to ensure that the system could handle twice the load, why not just write the requirement in terms of throughput, using words something like, "The system shall be able to handle twice the throughput as in a typical load of work"?  Well, for one thing, CPU utilization is, in many situations, easier to measure on an ongoing basis.  If you've ever run the system monitor on your computer, you know how easy it is to track how hard your CPU is working, every second of every day.  Whereas, to test how much more throughput your system could handle, you'd actually have to measure how much work your CPU is doing, then run a test to see if it could do twice as much work without falling behind.  A requirement written in terms of CPU utilization would simply be easier to check.

For another thing, at the time these requirements were being written, CPU utilization was an effective proxy for throughput.  That is to say, in the single-core, single-unit, single-everything days, the computer could essentially be treated like a cap-screwing machine on an assembly line.  If your machine could screw caps onto jars in one second, but jars only came down the line every two seconds, then your cap-screwing machine had a utilization of 50 percent.  And, on the basis of that measurement, you knew that if there was a sudden burst of jars coming twice as fast—once per second—your machine could handle it without jars spilling all over the production room floor.
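In code, the cap-screwing arithmetic is about as simple as it gets.  A toy sketch with names of my own choosing, assuming a single machine handling one jar at a time:

```python
def utilization(service_time, interarrival_time):
    """Fraction of the time a single machine is busy."""
    return service_time / interarrival_time

def burst_headroom(util):
    """How many times faster work can arrive before the machine saturates."""
    return 1 / util

u = utilization(service_time=1.0, interarrival_time=2.0)
print(u)                  # 0.5
print(burst_headroom(u))  # 2.0
```

With one machine, utilization and spare throughput are two views of the same number, which is exactly why the old requirement worked.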

In other words, CPU utilization was quite a reasonable way to write requirements to spec out your system—once upon a time.

Since those days, computer systems have undergone significant evolution, so that we now have computers with multiple CPUs, CPUs with multiple cores, cores with multi-threading/hyper-threading.  These developments have clouded the once tidy relationship between CPU utilization and throughput.

Without getting too deep into the technical details, let me give you a flavor of how the relationship can be obscured.  Suppose you have a machine with a single CPU, consisting of two cores.  The machine runs just one single-threaded task.  Because this task has only one thread, it can only run in one core at a time; it cannot split itself to work on both cores at the same time.

Suppose that this task is running so hard that it uses up just exactly all of the one core it is able to use.  Very clearly, if the task is suddenly required to work twice as hard, it will not be able to do so.  The core it is using is already working 100 percent of the time, and the task will fall behind.  All the while, of course, the second core is sitting there idly, with nothing to do except count the clock cycles.

But what does the CPU report as its utilization?  Why, it's 50 percent!  After all, on average, its cores are being used half the time.  The fact that one of them is being used all of the time, and the other is being used none of the time, is completely concealed by the aggregate measurement.  Things look just fine, even though the task is running at maximum throughput.
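Here is the two-core example as a small sketch.  The per-core figures are the hypothetical ones from the scenario above rather than measurements, and the helper names are invented for illustration.

```python
def aggregate_utilization(per_core):
    """The single number a whole-CPU monitor would report."""
    return sum(per_core) / len(per_core)

def single_thread_headroom(per_core, core):
    """Factor by which a task confined to one core could still speed up."""
    busy = per_core[core]
    return float("inf") if busy == 0 else 1 / busy

cores = [1.0, 0.0]  # core 0 saturated by the task, core 1 idle

print(aggregate_utilization(cores))      # 0.5, which looks comfortable
print(single_thread_headroom(cores, 0))  # 1.0, meaning no room to grow
```

The aggregate says there is room to double the load; the per-core view says there is none.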

In the meantime, while all of these developments were occurring, what was happening with the requirements?  Essentially nothing.  You might expect that at some point, people would latch onto the fact that computing advances were going to affect this once-firm relationship between CPU utilization (the thing they could easily measure) and throughput (the thing that they really wanted).

The problem is that requirements-writing is mind-numbing drudge work, and people will take any reasonable measure to minimize the numbness and the drudge.  Well, one such reasonable measure was to see what the previous system had done for its requirements.  What's more, those responsible for creating the requirements were, in many cases, not computer experts themselves, so unless the requirements were obviously wrong (which these were not), the inclination was to duplicate them.  That would explain the propagation of the old requirement down to newer systems.

At any rate, whatever the explanation, the upshot is that there is often an ever-diverging disconnect between the requirement and the property the system is supposed to have.  There are a number of ways to address that, to incrementally improve how well CPU utilization tracks throughput.  There are tools that measure per-core utilization, for instance.  And even though hyper-threading can also obscure the relationship, it can be turned off for the purposes of a test (although this then systematically underestimates capacity).  And so on.

But all this is beside the point, which is that CPU utilization is not the actual property one cares about.  What one cares about is throughput (and, on larger time scales, scalability).  And although one does not measure maximum throughput capacity on an ongoing basis, one can measure it each time the system is reconfigured.  And one can measure what the current throughput is.  And if the typical throughput is less than half of the maximum throughput—why, that is exactly what you want to know.  It isn't rocket science (although, to be sure, it may be put in service of rocket science).

<queueingtheory>And you may also want to know that the throughput is being achieved without concomitantly high latency.  This is a consideration of increasing importance as the task's load becomes ever more unpredictable.  Yet another reason why CPU utilization can be misleading.</queueingtheory>

Sunday, April 28, 2013

The Strange Existence (and Subsequent Non-Existence) of Albert Cribbage

(a précis)

This is not a poem, not even a prose poem.  But it shares enough quirkiness with poems I enjoy for me to find it in the spirit of National Poetry Month, so in it goes.

I found this while I was going over some of my old writing projects.  I say "projects"; these were not for any kind of organized course or anything.  I wrote (as I still do) whenever I have a bit of idle time and am able to cobble together thoughts in any particular direction.  This one struck my fancy, and the précis tag was intended to remind me to extend it into a more protracted argument, which (of course) never happened.  Other ideas distracted, and continue to distract.

Anyway, without further ado, we present:

The Strange Existence (and Subsequent Non-Existence) of Albert Cribbage

Albert Cribbage had his dream, and he spent much of his life constructing her. Blessed with the world's longest serial lucid dream, he manufactured a perfect Woman over a period of years, taking as inspiration pictures from fashion magazines (the late 80s, which he preferred, much to the disgust of his "hipper" friends), television commercials for beauty products, and several of the mainstream literary journals. He did, after all, want her to be well read.

At last he had completed her; all that remained to bring her to life (insofar as that was possible for her) was the Kiss. He went out and purchased fine satin sheets, and a royal purple bed cover set (limned in gold cord, of course). He settled into bed, and tried to go to sleep, with great difficulty, as he had never in his life been so excited.

At length, he managed to doze off. His dream began, as planned, with him approaching his soon-to-be-loved in his bed. He leaned down, as in the fairy tales of yore, and touched his dry, trembling lips to her still, perfect ones. Instantly, she opened her eyes, and it was as if they had been waiting for each other for all of eternity. She allowed herself to be swept up even further in the kiss, and he was soon with her, under the covers.

They made love, passionately, in the semi-darkness (where all dreams are; the well-lit ones are simply optical illusions in mid-slumber), and after several exhausting but very satisfying hours, their legs became entwined as they enjoyed the smooth sleep of afterglow.

In the morning she awoke, and the memory of the past night had left a smile on her face. But, she mused, he was still not quite perfect (italics hers), and once the morning niceties (a warm shower, a generous breakfast) were done, she set out for the newsstand, where she thought the latest monthlies from Paris might just give her the ideas she needed to create a new dream lover, one she would want to keep for good...

Copyright © 1996 Brian Tung

Wednesday, April 24, 2013

When We Flew

[Another Facebook post cannibalized for National Poetry Month.  At the time I wrote this, Kobe had not yet suffered his season-ending Achilles injury.]

I was watching yet another YouTube clip of Kobe wowing us with his athleticism and wizardry, and I started thinking about how many of the highlights were in another century. Hard as it may seem to believe at the moment, there will come a day when Kobe will no longer be able to dunk. It might not come this decade (hell, if MJ is any indication, it might not come the next, either), but it will come.

Anyway, I started getting a bit depressed about that, and so as if to bring myself out of that funk, I started scribbling some lines. And I found that it actually sort of helped, a little. I hasten to emphasize that all this has nothing at all to do with the fact that a birthday is coming up, or anything like that. That is so a coincidence.

It may read as though it's about other things, and it can be. But I really did write it with basketball in mind.

when we flew

When we flew,
we made legends.
We startled and we stunned,
and foes grasped at us in vain.
Our wings would never tire,
and our lungs never fail.
The world lived a thousand times
and never knew how close it had come,
and all because we flew
     when we flew.

When we flew,
time stood to watch,
then travelled back to watch again,
hardly daring to believe.
Space cleared space for us,
and light held us in her gaze.
The stars shone their mute fanfare
shattering their crystal spheres,
and all because we flew
     when we flew.

Now we stand,
make way while children soar.
We wear our pride like envy,
and dress our unease in longing.
We envision battles we will never fight,
and so we shall never lose.
A thousand times we'll close our eyes and ears
and sip champagne from glass slippers,
and all because we flew
     when we flew.

Copyright © 2012 Brian Tung

The Wolfpack and the Lone Wolves

So a friend of mine posted a link to this story, and because it involves game theory (even though it wasn't actually a game theory course) and I do, in fact, work like that, I immediately started thinking about a way to analyze it.  Actually, I think thirty students is way too many to get some of the more interesting interactions going; the wolfpack is almost certainly the way to go, especially if you're not one of the brighter bulbs.  Thirty is probably too many to deal with analytically anyway.  So let's start with three.

Suppose the three students A, B, and C have (possibly) different aptitudes, represented by a, b, and c, respectively.  These three numbers represent the probability with which each of the students answers questions correctly.  (We'll assume that questions have two answers, one right and one wrong.)  Without loss of generality, let's say that a ≥ b ≥ c.  Under which conditions will two or more of these students collude?  Without explicitly prescribing a curve, let us say that the aim of any of the students is to improve their own grade; specifically, there is no benefit to philanthropy.

We can fairly quickly conclude that two students will not collude.  Consider A and B, and suppose first that a > b, and that A and B know that (that A is a better answerer than B).  A and B compare answers.  If they coincide, then of course they both answer that way, but if they differ, they'll choose A's answer (since it's more likely to be correct than B's).  But if that's the case, then they'll both answer correctly only if A already had the right answer.  That is to say, both A and B will answer correctly with probability a.  Well, there's no reason for A to collude with B, since it helps B without helping A.

The situation is not helped even if a = b, since the only difference is that some other means must be used for breaking the tie.  No matter how the tie is broken, the answer that is chosen cannot have a greater probability of being correct than a = b, so there is no benefit to collusion for either A or B.

A similar line of reasoning applies to any other pair of students.  Well, then how about all three students colluding?  That will only happen if all three students are benefited, and A, with the highest aptitude, is the standard here.  Let's consider how A's answer would be affected by the collusion.  The first way is that A's initially correct answer would be made incorrect by collusion.  That happens if A would have answered correctly, but B and C would not.  That happens with probability a (1 - b) (1 - c).

The second way to affect A's answer is to change an initially incorrect answer into a correct one.  That happens with probability (1 - a) b c.  So, on balance, A has an incentive to collude (and therefore all students do) if

(1 - a) b c > a (1 - b) (1 - c)

For instance, if the three students respectively have 90, 80, and 70 percent probabilities of answering questions correctly, then we have

(0.1) (0.8) (0.7) = 0.056 > 0.054 = (0.9) (0.2) (0.3)

and it makes sense for all three to collude, by this metric.
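The comparison is easy to play with in code.  A minimal sketch, assuming the three colluders simply go with the majority answer (which is how I read the argument above) and that their answers are independent; the function names are my own.

```python
def collusion_gain(a, b, c):
    """Net change in A's chance of a correct answer under majority vote."""
    fixed = (1 - a) * b * c          # A wrong, outvoted by a correct B and C
    broken = a * (1 - b) * (1 - c)   # A right, outvoted by a wrong B and C
    return fixed - broken

def majority_correct(a, b, c):
    """Probability that at least two of the three answer correctly."""
    return (a * b * c
            + a * b * (1 - c)
            + a * (1 - b) * c
            + (1 - a) * b * c)

print(collusion_gain(0.9, 0.8, 0.7))    # positive, so A gains slightly
print(majority_correct(0.9, 0.8, 0.7))  # just above A's solo 0.9
```

The gain is positive only by a whisker here; nudge c down a little and A is better off alone.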

Why by this metric?  What other metric could there be?  Suppose we now introduce an explicit curve: The students receive, as their final grade, not their actual raw score, but a ranking-scaled score.  The top raw score earns three points, the second best raw score earns two points, and the lowest raw score earns one point.  Two students tying at the top both earn 2.5 points, while two students tying at the bottom earn 1.5 points, and finally if all three students tie, they all earn two points.
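The curve itself can be written down compactly.  A sketch under the rules just stated, with tied students splitting the points for the places they occupy; the function name and the sample raw scores are hypothetical.

```python
def curved_scores(raw):
    """Map raw scores to rank points; tied students share the average."""
    ordered = sorted(raw.values())
    points = {}
    for name, score in raw.items():
        # Places (1 = lowest) that this raw score occupies in the ranking.
        places = [i + 1 for i, s in enumerate(ordered) if s == score]
        points[name] = sum(places) / len(places)
    return points

print(curved_scores({"A": 9, "B": 8, "C": 7}))  # no ties
print(curved_scores({"A": 9, "B": 9, "C": 7}))  # tie at the top
print(curved_scores({"A": 8, "B": 8, "C": 8}))  # three-way tie
```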

Under these conditions, the three students will not all collude.  A, as the best student, is the most likely of the three to earn three points, and the more questions there are, the more certain that is.  If A, B, and C all collude, they will all three earn two points (since their answers will be identical).  So chuck out three-way collusion.

But two-way collusion is now even less likely than before.  As we observed, it only improves the accuracy of the inferior student.  Before, that at least did not hurt the superior student, but now it improves the inferior student's scaled score at the expense of the superior student's scaled score.  So two-way collusion is out, too.

Shall we move on to four students?  I'll save that for a later post.