Tuesday, June 06, 2006

Lying with statistics

Mark Chu-Carroll from Good Math, Bad Math writes about Fundamentalists' dishonest use of statistics:

Woodmorappe wants to figure out how much space is needed [on Noah's Ark] by 16000 animals. How does he do that? Easy. He figures out an average amount of space needed by each animal, and multiplies it by the number of animals. Doesn't sound bad as a way of making an estimate, right?

Except... How does he figure out how much space is needed by each animal?

He figures out the median size of the animals in the ark; and determines that the median-sized animal would need a space of .5 by .5 by .3 meters - that is less than 1/10th of a cubic meter per animal. Multiply that by the number of animals, and you come up with 1200 cubic meters. Not terribly much space at all - which would mean that there would be plenty of room for food for the animals.

So what's wrong?

What's wrong is that the median is a dishonestly misleading figure to use.

In general, there are two major figures that can be used as an "average" value: the mean and the median. The median is the "middle" value, and is very good for estimating what a "typical" example of some collection is. For example, if you want a good idea of how much money a firm pays its employees, the median is the best figure to use.

The mean is the "average" you probably learnt about in school: add up all the figures, and divide by the number of figures.

In this case, using the median to extrapolate to the total space needed on the Ark is dishonestly misleading. Here's a simplified example of why:

Suppose you've got to ship seven animals from one farm to another. How big a truck do you need?

You've got five rabbits, each about 15cm (6") tall, one sheep about 90cm (3') tall, and a cow 1.5m (4'11") high. The median is the middle value of 15, 15, 15, 15, 15, 90, 150 = 15cm, and is much more typical for our seven animals than is the mean of 45cm.

To see why Woodmorappe's calculation is wrong, let's extrapolate from the "average" to the total height. That's what Woodmorappe is doing: he needs to know how big the Ark must be to hold all those animals. The right way to do it is to multiply the mean by the number of animals: 45*7 gives us 315cm, which is the total height of our seven animals. The wrong way, which Woodmorappe does, is to multiply the median: 15*7, which gives 105cm. That enormously underestimates the size needed, and notice that the total height we calculate using Woodmorappe's method isn't even big enough for the cow.
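The arithmetic in the truck example can be checked with a short Python sketch. The variable names (heights_cm, mean_estimate, median_estimate) are made up for illustration, and the heights are the seven from the example, not anything from Woodmorappe's data.

```python
from statistics import mean, median

# Heights in cm from the truck example: five rabbits, one sheep, one cow.
heights_cm = [15, 15, 15, 15, 15, 90, 150]

n = len(heights_cm)
actual_total = sum(heights_cm)             # true combined height
mean_estimate = mean(heights_cm) * n       # extrapolate from the mean
median_estimate = median(heights_cm) * n   # extrapolate from the median

print("actual total   :", actual_total, "cm")
print("mean * count   :", mean_estimate, "cm")
print("median * count :", median_estimate, "cm")
print("tallest animal :", max(heights_cm), "cm")
```

Multiplying the mean by the count gives back the true total of 315cm, as it must, since the mean is defined as the total divided by the count. Multiplying the median by the count gives only 105cm, which is less than the height of the cow alone.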

Woodmorappe radically underestimates how big the Ark would need to be. There are thousands of tiny animals, rat-sized or smaller, which outnumber larger beasts like elephants not just in the number of individuals but in the number of distinct species (or, as the Fundamentalists prefer to say, "kinds"). Using the median, Woodmorappe extrapolates from the typical animal size (rat-sized) and effectively pretends that larger animals like apes, elephants, cows and lions take up no more room than a rat. Using the median in this way isn't a subtle error -- it is such an obvious error, such a strange thing to do, that it is hard to credit that it could be an innocent mistake.

(Good Math, Bad Math is moving to Science Blogs.)
