Showing posts with label maths. Show all posts

Wednesday, November 29, 2006

Random number generators

"Anyone who uses arithmetic methods to produce random numbers is in a state of sin."
-- John von Neumann.

Paradise Poker has an interesting page on how they generate random numbers for their Internet casinos, and why most random number generators can't shuffle even a small list of items properly. For example, a standard deck of cards can be shuffled 52! different ways (more than 10^67, or ten thousand billion billion billion billion billion billion billion). A 32-bit random number generator can generate at most about four billion combinations -- clearly inadequate.
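If you want to check the arithmetic, a couple of lines of Python will do it (my own quick back-of-the-envelope calculation, not anything from Paradise Poker's page):

```python
import math

deck_orderings = math.factorial(52)   # 52! -- every possible ordering of a deck
rng_states = 2 ** 32                  # distinct values a 32-bit generator can supply

print(deck_orderings)                 # roughly 8.07 x 10^67
print(rng_states)                     # 4,294,967,296
print(deck_orderings // rng_states)   # ~1.9 x 10^58 shuffles each seed would have to cover
```

A 32-bit seed can pick out at most four billion or so of those 10^67 orderings; the overwhelming majority of shuffles simply can never occur.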

Friday, November 03, 2006

I'm no statistician

I'm no statistician, but I'm pretty sure this is correct:

[Image: pacman chart -- click for a larger view.]

Shamelessly stolen from here.

Monday, August 07, 2006

Continued fractions

Mark C. Chu-Carroll from Good Math Bad Math has a nice explanation of continued fractions, what they are and why they are useful.

I like continued fractions. They're neat and full of wonderful properties. They are the basis of a lovely little algorithm for converting arbitrary decimal numbers into fractions. There are some great patterns in continued fractions too, like the square root of two. sqrt(2) can't be written exactly as a simple fraction, and the decimal expansion is messy: 1.414213... it goes on for ever, and despite the first couple of 14s there's no pattern to it.

But write it as a continued fraction and you'll see a wonderful pattern:

1 + 1/(2 + 1/(2 + 1/(2 + 1/(2 + ...))))

which can be written as [1; 2, 2, 2, 2, ...].
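If you'd like to check this on a computer, here's a rough Python sketch (my addition, not part of Mark's post). It expands a number into its continued-fraction terms, and collapses a list of terms back into an exact fraction, which is essentially that lovely decimal-to-fraction algorithm mentioned above:

```python
from fractions import Fraction

def to_continued_fraction(x, max_terms=8):
    """Peel off the whole-number part, then repeat on the reciprocal of what's left."""
    terms = []
    for _ in range(max_terms):
        whole = int(x)
        terms.append(whole)
        remainder = x - whole
        if remainder == 0:
            break
        x = 1 / remainder
    return terms

def from_continued_fraction(terms):
    """Collapse [a0; a1, a2, ...] back into a single exact fraction."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value

print(to_continued_fraction(2 ** 0.5))                    # [1, 2, 2, 2, 2, 2, 2, 2]
print(from_continued_fraction([1, 2, 2, 2, 2, 2, 2, 2]))  # 577/408
```

Collapsing just those eight terms gives 577/408, which is about 1.414216 -- already correct to five decimal places.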

You want the reciprocal of sqrt(2)? That's easy: [0; 1, 2, 2, 2, 2, ...]. Or, expanding it:

0 + 1/(1 + 1/(2 + 1/(2 + 1/(2 + 1/(2 + ...)))))

The rule for forming the reciprocal of a continued fraction is simple. If the whole number part (the first number, before the semicolon) is 0, you remove it and shift all the numbers one to the left. If it isn't zero, you shift all the numbers to the right by one, and stick a 0 as the whole number part.
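In code, that rule is a one-liner (again my own sketch, using the same list-of-terms representation as above):

```python
def reciprocal_cf(terms):
    """If the whole-number part is 0, drop it; otherwise push a 0 onto the front."""
    return terms[1:] if terms[0] == 0 else [0] + terms

print(reciprocal_cf([1, 2, 2, 2, 2]))     # [0, 1, 2, 2, 2, 2]  -- 1/sqrt(2)
print(reciprocal_cf([0, 1, 2, 2, 2, 2]))  # [1, 2, 2, 2, 2]     -- back to sqrt(2)
```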

Saturday, July 29, 2006

Escher's logarithmic transform

In 1956, M.C. Escher completed his drawing "Print Gallery". It shows a young man looking at a print in a gallery that is strongly deformed, but with an underlying sense of order to the deformation. It took until 2003 for mathematicians to discover the structure of the mathematical transformation which Escher came up with intuitively.

Wednesday, July 26, 2006

You can't eat a bag of grapes one grape at a time

Good Math, Bad Math has a good article debunking a professional mathematician prostituting his craft for Creationism with pathetically bad faux-mathematical arguments against evolution:

The article claims that there are two arguments from mathematics that disprove evolution. Both are cheap rehashes of old creationist canards, so I won't go into much depth. But it's particularly appalling to see someone using trash like this with the claim that it's a valid mathematical argument.
[...] this asserts that it's impossible to move a large finite distance by taking small finite steps. This is allegedly a mathematician making this argument - but that's what he's claiming: that it's impossible for any large change to occur as a result of a large number of small changes.

So, according to this Creationist, erosion can't happen. You can't empty a 10kg sack of sugar one teaspoon at a time, let alone one grain at a time. There's no way to paint a room with a paint brush, one stroke at a time. And just forget about an abandoned swimming pool evaporating, one molecule of water at a time... small changes just can't add up to large effects.

That's Creation so-called "science" for you.

Monday, July 24, 2006

How statistics defeated Hitler and won the war

The Guardian is running a short article on how statisticians helped win World War Two for the Allies.

Here is a story about mathematical deduction that I love, mainly because it is said to be true, and because it had an impact (albeit small) on the outcome of the second world war. It is the story of how a simple statistical formula successfully estimated the number of tanks the enemy was producing, at a time when this could not be directly observed by the allied spy network.

I'd question the "albeit small" qualification -- had the Allies not trusted the statisticians' estimates, they could easily have delayed the invasion of Europe. Who knows what effect that would have had? Soviet tanks on the French border?

What is impressive about the maths is that the statisticians' estimate was almost exactly right: they estimated that Germany was producing 246 Panzer Mark IV and V tanks per month, compared to conventional intelligence estimates of 1,400. After the war, captured German production records showed that the actual number produced was 245 per month.

But what is really impressive is that the Guardian hasn't dumbed the article down to the point of uselessness: they actually give the statisticians' formula, and use the correct terminology:

Given a sample size S and maximum serial number M, a good estimator of the number of tanks made would be (M-1)(S+1)/S.

Now, all we need is to see how they derived that formula, and my day will be complete.
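In the meantime, here is one common way to motivate it: the largest serial number seen, M, must undercount the true total; the average gap between the S observed serial numbers is (M - S)/S, and adding one such gap on top of M gives M + M/S - 1, which is the Guardian's formula give or take a term of 1/S. And here's a quick simulation to check that the estimate really does home in on the truth (my own sketch, using that usual form of the estimator):

```python
import random

def estimate_total(serials):
    """The usual 'German tank problem' estimator: M * (1 + 1/S) - 1."""
    M, S = max(serials), len(serials)
    return M * (1 + 1 / S) - 1

random.seed(1)
true_total = 245          # pretend the enemy really built 245 tanks this month
sample_size = 10          # and we observed the serial numbers of 10 captured tanks

estimates = [
    estimate_total(random.sample(range(1, true_total + 1), sample_size))
    for _ in range(10_000)
]
print(sum(estimates) / len(estimates))   # averages out very close to 245
```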

Thursday, June 29, 2006

From the land where Up is Down

War is Peace and Slavery is Freedom.

Tim Lambert at Deltoid writes about the mud being laid on with a trowel over the documentary "An Inconvenient Truth".

How is An Inconvenient Truth doing at the box office? Pretty well. The gross takings have increased every weekend and have almost reached $10,000,000. It's already the number 7 on the all time box office list for documentaries.

But that's not how it's being spun by the wingnuts:

UPI reports that Al Gore's movie, An Inconvenient Truth, hasn't done so well after a promising start:

Former U.S. vice-President Al Gore's documentary "An Inconvenient Truth" has seen its ticket sales plummet after a promising start.
[...]
The film dropped from its record $70,333 per play to $12,334 during its third week and its numbers have continued to fall as the film opens in smaller cities and suburbs across the country.

This is a text-book example of lying with statistics. As the movie is shown in smaller cities and suburbs, the average take per session falls -- but the total take increases. Strangely enough, UPI and Variety report that increasing take as plummeting. How curious. The right-wing media would lie about something? Say it ain't so!

Deltoid has an impressive graph showing clear growth here. As the graph shows, the film is not only still making money, but the amount of money it makes each week is continuing to grow.

Of course the wingnuts don't want you to know that, so what to do, what to do? There has never been a statistic that can't be distorted, so they focus on the per-session average, ignoring the fact that there are a lot more sessions. Of course, that means pretending that Up actually means Down, but hey, so long as the wingnuts continue to drink that Kool-Aid, it's all good.
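To see how the trick works, here's a toy illustration with entirely made-up numbers (nothing to do with the film's actual takings):

```python
week1_theaters, week1_gross = 4, 280_000      # tiny opening weekend, every session sold out
week4_theaters, week4_gross = 500, 2_000_000  # wide release in the suburbs

print(week1_gross / week1_theaters)   # 70000.0 per theatre in week one
print(week4_gross / week4_theaters)   # 4000.0 per theatre in week four -- "plummeting!"
print(week4_gross > week1_gross)      # True: the total take still went up
```

Same movie, more money coming in, and yet the "per play" number has collapsed -- which is the only number UPI chose to report.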

Update:
Good Math, Bad Math has a good analysis of the dodgy statistics being used by UPI.

[...]when it was first released, it was being shown in a small number of showings in a small number of theaters. When it was premiered in 4 theaters, they sold out standing room only - so the gross per showing was very high. Now, four weeks later, it's showing in over 500 theaters, and the individual showings aren't selling out anymore. But more people are seeing it - every weekend, the number of people seeing it has increased!

The Powerline article (and the UPI article which it cites) are playing games with numbers to skew the results. They want to say that Al Gore's movie is tanking in the theaters, so they pick a bizarre statistic to support that, even though it's highly misleading. In fact, it's one of the best performing documentaries ever. It's currently the number seven grossing documentary of all time, and it's about $600,000 off from becoming number 5.
(Emphasis in original.)

Saturday, June 17, 2006

Bad maths about Irreducible Complexity

Mark C. Chu-Carroll over at Good Math Bad Math takes a philosophical position and attempts to use Godel's Theorem to prove that irreducible complexity is a meaningless concept:

No matter what you do - no matter what kind of formal system you've developed for showing that something is minimal, you're screwed. Godel just came in and threw a wrench into the works. There is absolutely no way that you can show that any system is minimal - the idea of doing it is intrinsically contradictory.


In summary: Mark thinks that Creationists need a guaranteed, perfect, 100% accurate irreducible complexity detector in order to do good science, and tries to demonstrate that there is no such beast. But Creationists don't need one. All they need is a detector that works at least once.

The Creationist concept of irreducible complexity (IC) as proof against evolution is irredeemably flawed. There is no shortage of websites that demonstrate the feebleness of IC as an anti-evolution argument, so I'll limit myself to a single point:


Mark sets his sights very high: he wants a knock-out blow -- proof that not only can there be no examples of IC, but that the very concept is meaningless. Shorn of its mathematics, his argument can be summed up thusly:

  1. Proving that a biological system is irreducibly complex is mathematically equivalent to proving that it is minimal in the number-theoretic sense;

  2. Godel's Theorem holds for that biological system;

  3. Therefore proving that some biological system is minimal is impossible;

  4. Therefore we can't ever prove that a system is irreducibly complex;

  5. And therefore the concept of IC is scientifically meaningless.

Unfortunately, Mark gives no evidence for his very first premise. The concept of minimality in the number-theoretic sense has a specific meaning, and it isn't clear at all that irreducible complexity is equivalent to that specific meaning (mostly because it isn't clear what IC actually is). But, for the sake of the argument, let's assume point (1) is correct.

Point (2) is even more problematic. Mark has fallen for the error of greedy reductionism. Reductionism is a powerful tool, in biology no less than any science. But like all tools, it can be misused, and greedy reductionism is such an error: Mark ignores developmental biology, and consequently imagines that there must be a 1:1 correspondence between the genotype of an organism (the DNA) and its phenotype (the actual organs and molecular protein machines). Because DNA is Turing Complete, he imagines that Godel's Theorems must also apply to higher-order biological features -- the sorts of molecular machinery, the lipids and proteins and enzymes, that we must look at when trying to find irreducible complexity.

In a reply to a comment, he writes:

But per Godel, no axiomatic system is complete and consistent. If you have an incomplete axiomatic system, then the axiomatic system isn't powerful enough to do minimality proofs. If the axiomatic system is powerful enough to do minimality proofs (i.e., it's complete), then it's by definition inconsistent.

But this is just not so. Mark skips over a requirement for Godel's Theorems to hold. The most important is that the axiomatic system be computably enumerable, that is, there must be an effective way to enumerate all the statements (the candidate theorems) of that system.

Are biological systems computably enumerable? Genes are comfortably digital, and thus any finite number of genes will be computably enumerable. A gene is a gene is a gene, regardless of the organism. But a gene's phenotype, its effect, is not necessarily digital; it is often analog, and that means continuous rather than discrete.

If we look for irreducible complexity, we won't be looking for it in the genotype of the organism's DNA, we'll be looking in the phenotype: in the proteins, lipids and enzymes in 3D space, and the interactions between them. It is not at all clear that phenotypes are computably enumerable, even at the molecular level. At the very least, since molecules exist in a three-dimensional continuum, there is an uncountably infinite number of statements we could make about the relative positions of any two atoms. Adding the axioms of chemistry -- or physics -- will not help him, since we are rapidly getting into areas where we don't know enough to enumerate those axioms, let alone all the possible true and false statements that follow from them. (Given the laws of quantum mechanics, derive the solubility in water of all possible molecules containing 100 carbon atoms. Quickly now, we don't have all day.)

Mark's argument falls down badly at point (3). He writes:

This is a result of some work done by Greg Chaitin in Algorithmic Complexity Theory. A fairly nifty version of this can be found on Greg's page.

The fundamental result is: given a system S, you cannot in general show that there is no smaller/simpler system that performs the same task as S.

But Mark has missed an important part of what Chaitin says, and that makes a vast difference to his conclusion. Chaitin writes:

The first of these theorems states that an N-bit formal axiomatic system cannot enable one to exhibit any specific object with program-size complexity greater than N + c.

and furthermore, writes here:

Show that a formal system of lisp complexity H_lisp (FAS) = N cannot enable us to exhibit an elegant S-expression of size greater than N + 410. An elegant lisp expression is one with the property that no smaller S-expression has the same value. Setting: formal axiomatic system is never-ending lisp expression that displays elegant S-expressions.

What Chaitin has demonstrated is this:

Given a formal axiomatic system of some given amount of complexity, there exist sufficiently more complex systems within our formal system for which we cannot prove whether they are minimal or not.

For the programming language Lisp, Chaitin claims to have proven that "sufficiently more complex" means a mere 410 characters over and above the complexity of the formal axiomatic system itself. For DNA, we have no idea what "sufficiently more complex" may be; for the phenotype of even a simple organism, we can't even begin to guess. Hypothetically, it could be billions upon billions of gigabytes. Mark merely assumes that, since there are sufficiently large systems which can't be proven minimal, no systems can be proven minimal.

Since there are animals, like elephants, which are too big for me to lift, all animals must be too big for me to lift.

Or, to put it another way, Mark's argument is this:

There are hypothetical biological systems so complex that we can't prove, in the mathematical sense, that they are irreducibly complex; therefore the entire concept of irreducible complexity is meaningless.

When you strip out all the mathematics and formal language, the error is obvious. What about simpler biological systems? Can we not prove they are minimal? Maybe not -- but Mark hasn't ruled them out, and so his entire argument against IC is shot down in flames.
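For what it's worth, below Chaitin's threshold minimality really can be settled by brute force. Here's a toy sketch (entirely my own, and nothing like Chaitin's Lisp formalism): in a tiny language of single digits joined by + and *, we can find the provably shortest expression for a value simply by trying every shorter candidate first:

```python
from itertools import product

DIGITS = "123456789"
OPS = "+*"

def expressions(n_digits):
    """Yield every expression built from n_digits single digits joined by + or *."""
    for digits in product(DIGITS, repeat=n_digits):
        for ops in product(OPS, repeat=n_digits - 1):
            expr = digits[0]
            for op, d in zip(ops, digits[1:]):
                expr += op + d
            yield expr

def shortest_expression(target, max_digits=4):
    """Search in order of increasing size, so the first hit is provably minimal
    ('elegant', in Chaitin's terminology) within this toy language."""
    for n in range(1, max_digits + 1):
        for expr in expressions(n):
            if eval(expr) == target:   # standard precedence: * binds tighter than +
                return expr
    return None

print(shortest_expression(36))   # '4*9' -- and we know no shorter expression works
```

The point is only that minimality is perfectly decidable for small enough objects in a small enough system; undecidability bites above the threshold, which is exactly the hole Mark needs to close.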

Interestingly, Mark himself recognises the existence of this hole in his argument, but just waves his hands and hopes it will go away. In this reply to a comment, he says:

If you look at the proof, there is a critical threshold, based on the size of the formal axiomatic system (FAS) that proves minimality. The problem with minimality kicks in as soon as the complexity of the system becomes such that the length of the meta-program plus the size of the FAS is smaller than the information-theoretic complexity of the system.

Tiny systems - like a single-instruction program - are probably smaller than that threshold. (I'll explain the probably bit in a moment.) But we can't know where that threshold is: because finding the threshold requires finding the minimum size of a FAS that can be used to show minimality. And that's the minimality problem biting itself.

(Emphasis in original.)

Talk about Bad Math! Mark's argument, then, is effectively:

  1. For all X > N, where N is some unknown but positive non-zero number, IC is undecidable.

  2. Therefore IC is undecidable for all X.

Hmmm.

But it gets worse for Mark. He states that we can't know where that threshold is, and even emphasises it -- but if you read Chaitin's page that Mark linked to, you will see Chaitin has written:

As is shown in this course, the algorithms considered in the proofs of these two theorems are now easy to program and run, and by looking at the size in bits of these programs one can actually, for the first time, determine exact values for the constants c and c'.

Chaitin says he can calculate the threshold, and does. This wasn't hidden deep within a 300 page tome; it was the fifth paragraph, and Mark apparently missed -- or ignored -- it.

So far Mark's argument is not looking healthy: the first four points out of his five are either unproven, unjustified or wrong. The fifth point, his final conclusion, is even more flawed. He has confused undecidability with meaninglessness. Chaitin tells us that, given a sufficiently large system, we can't prove whether it is minimal or not; from this, Mark argues that therefore the very concept of being minimal is nonsense. He writes, replying to a comment:

The fact remains that no test can ever demonstrate that a real system is IC. A scientist can't test to see if a hypothesis of IC holds up. It's a non-testable hypothesis, because the underlying concept is invalid.

(Italics in original -- bold added.)

Mark hasn't demonstrated that IC is invalid. All he has shown is that some examples of IC are unprovable, not that there are no provable examples at all.

Mark falls for the fallacy of assuming that just because a statement of a formal system is unprovable, it must be invalid. But that's nonsense: unprovable (or undecidable) doesn't mean false, or meaningless. To take a non-mathematical example, it is undecidable whether Alexander the Great's maternal grandmother ate a bowl of fried potato chips on her 10th birthday; but it is certainly either true or false that she did. That logical undecidability doesn't prevent us from making a probabilistic decision that, beyond all reasonable doubt, she did not: potatoes weren't introduced into Europe (as far as we know) until over a thousand years later.

A more mathematical example of an undecidable statement is Goodstein's Theorem, which is true of the natural numbers but cannot be proven within Peano arithmetic.

Despite all the holes in Mark's reasoning, let's grant him his conclusion: IC is logically undecidable. What does this mean for the science of biology?

Very little. All it means is that IC is unprovable in the mathematical sense, not unprovable in the scientific sense.

Science has much lower standards than mathematics: proof beyond all reasonable doubt, not beyond all doubt. All truths are provisional in science. In mathematics, we can be absolutely 100% certain that (e.g.) 11 is a prime number -- but in science, the best we can say about any fact is that it is true to the best of our knowledge, subject to revision in the light of new facts.

That "best" might be very good indeed. You can bet your life on many things in science, but it is never 100% certain. Scientists sometimes produce formal proofs based on simplified models of whatever feature they are interested in, but the model is not reality. Mathematical models are always simpler than reality. Consequently, science is ultimately based on empirical evidence. And what empirical evidence can prove, better empirical evidence can disprove.

(The classic example is Newtonian and Relativistic Mechanics; astronomical observations of the planets were Newtonian physics' greatest triumph -- until more accurate observations discovered discrepancies which couldn't be successfully explained by Newtonian physics.)

It is unfair to hold Creationists to higher standards than we hold real scientists doing science. Let's accept, for the sake of the argument, that Mark is correct: we can't find absolutely certain mathematical proof that a specific biological system is minimal and therefore IC. Fine; but that's irrelevant to the question of whether we can be provisionally sure that it is minimal. On the basis of empirical research, Fermat's Last Theorem was considered proven beyond all reasonable doubt many years before a rigorous mathematical proof was discovered. Mathematicians don't care for empirical evidence, but that's the fundamental core of science.
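That empirical evidence, in the case of Fermat's Last Theorem, amounted to failing to find a counterexample despite enormous searches. A toy version of that kind of search (my own illustration, restricted to a tiny range) looks like this:

```python
def fermat_counterexample(max_base=50, max_exp=10):
    """Look for a, b, c, n with a^n + b^n == c^n and n >= 3; FLT says there are none."""
    for n in range(3, max_exp + 1):
        nth_powers = {c ** n: c for c in range(1, 2 * max_base)}   # covers every possible sum
        for a in range(1, max_base + 1):
            for b in range(a, max_base + 1):
                if a ** n + b ** n in nth_powers:
                    return (a, b, nth_powers[a ** n + b ** n], n)
    return None

print(fermat_counterexample())   # None -- no counterexample in this range
```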

Let's pretend, for the sake of the argument, that Creationists like Behe who work with IC are doing real science. (Try not to laugh too hard.) They are working in a research program aimed at collecting enough evidence to disprove the fact of evolution (except our hypothetical Creationist scientists will disagree that it is a fact). To do so, they aim to prove that some sufficiently large number of biological features are irreducibly complex. For them, one such IC system is sufficiently large; for Richard Dawkins, it may require hundreds of examples before he'll admit that evolution can not explain all the facts of the biological world.

If the Creationists are lucky, they will find some examples of IC that are simple enough that they can formally prove they are minimal, and we are done. Mark's argument is shot down in flames, and biologists start looking for a new paradigm to explain the features of the biological world.

If the Creationists aren't so lucky, they will find biological features that they think are IC, but can't formally prove that they are absolutely minimal. Will that matter? No, of course not, any more than biologists let the fact that they don't have a formal axiomatic system for the mechanics of predation prevent them from accepting as true the empirical rule that, in general, predators will catch more slow, weak prey and fewer fast, strong prey. Biologists will try to falsify the IC hypothesis. If there are enough failures (whatever "enough" means), biologists will become convinced, on the balance of the evidence, that these are actual IC systems. The lack of formal proof will disturb them no more than the lack of formal proof that the sun is mostly made of hydrogen disturbs physicists. Empirical evidence is enough.

In conclusion, Mark has:

  • Assumed that Godel's Theorems apply to biological phenotypes, an unjustified assumption;

  • Tried, but failed, to demonstrate that Godel's Theorems forbid us from proving the existence of irreducibly complex features; and

  • Failed to demonstrate that it would matter even if proof of IC was impossible.

There are many reasons to consider irreducible complexity to be a degenerate research program, but Godel's Theorems are not one of them.