Saturday, May 2, 2009

Information Theory and Biological Evolution

The question of the origin of species, which Darwin was concerned with, or the origin of the first life form can be more broadly stated as the question of the origin of biological information. Whether one is interested in the origin of the first protein or in the development of new structures such as an eye or a wing, a successful explanation of these requires an accounting for an increase in biological information (e.g. DNA, RNA, or proteins). Darwin's theory of natural selection and mutation became famous because it proposed a natural explanation for such increases in information (though Darwin of course would not have phrased it quite that way), which previously had been widely regarded as requiring an intelligent agent (which would have been God, for most).

Darwin's combination of chance and natural selection was crucial; chance by itself (random mutations in the genetic code or random conglomerations of organic chemicals) is a hopeless approach to constructing even the simplest biological molecule. Take a small protein which consists of a chain of around 100 amino acids; the chance for a small functional protein to form in a random search is around 1 in 1.27 × 10^130 (that is, 20^100: 20 possible amino acids at each of 100 positions, counting a single specific sequence as the target). This is well beyond any reasonable probability limit, as would be the probability for any other biological feature (e.g., a strand of DNA). But, with natural selection preserving the individual random changes that are beneficial, larger changes can gradually accumulate. Richard Dawkins illustrates this with the following scenario: take the phrase "METHINKS IT IS LIKE A WEASEL." Imagine a computer randomly generating phrases out of the 26 letters of the English alphabet plus the space; the probability of producing this 28-character phrase in a single random attempt is practically zero (about 1 in 10^40). However, if the computer keeps each letter that lands in its correct position in the phrase, then gradually (in fact, quite easily) the target phrase will be generated. In the biological world, natural selection and mutation achieve something similar: natural selection preserves the beneficial random mutations, resulting in new biological information.
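
To make the numbers and the procedure concrete, here is a minimal Python sketch of the scenario as it is described above. It is not Dawkins's original program; the function names, the seed, and the 27-symbol alphabet (26 capital letters plus the space) are my own assumptions for illustration. It computes the size of each search space and then runs the letter-keeping search.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # assumption: 26 letters plus the space


def search_space_size(target=TARGET, alphabet=ALPHABET):
    """Number of equally likely strings with the same length as the target."""
    return len(alphabet) ** len(target)


def locked_letter_search(target=TARGET, alphabet=ALPHABET, seed=0):
    """Redraw every position that does not match the target; keep matches.

    Returns the number of rounds needed to reach the target phrase.
    """
    rng = random.Random(seed)
    current = [rng.choice(alphabet) for _ in target]
    rounds = 0
    while "".join(current) != target:
        rounds += 1
        for i, wanted in enumerate(target):
            # The search consults the target at every position, every round.
            if current[i] != wanted:
                current[i] = rng.choice(alphabet)
    return rounds


if __name__ == "__main__":
    print(f"100-residue sequences (20**100): {float(20 ** 100):.3e}")
    print(f"28-character phrases (27**28):   {float(search_space_size()):.3e}")
    print("rounds needed with letter keeping:", locked_letter_search())
```

With letter keeping, the search typically finishes in something on the order of a hundred rounds, against roughly 10^40 equally likely phrases for a single blind draw. The exact round count depends on the seed.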

But can information really be gotten so easily? It turns out that Dawkins's illustration is fatally flawed. Consider: the computer had a target phrase, and preserved incremental changes by comparing the randomly generated letters to this target phrase. In nature there can be no targets; natural selection and random mutation do not have a goal in mind. In nature, only changes that are beneficial now can be preserved, not changes that will become beneficial only in the future. An intelligent agent (Dawkins, in his illustration) may know that the gibberish is gradually turning into something that makes sense (METHINKS IT IS LIKE A WEASEL), but blind natural forces cannot possibly be shooting for such a target. So it turns out that the illustration meant to show how blind natural forces can generate information actually contains the target information ahead of time, put there by an intelligent agent.

William Dembski and Robert Marks II are working on what they are calling the Law of Conservation of Information. Essentially, it states that you cannot get more information out of a computational algorithm than you put in initially. If this is true, nonteleological evolution cannot in principle explain the origin of new biological information. See their paper here: http://www.uncommondescent.com/evolution/life%E2%80%99s-conservation-law-why-darwinian-evolution-cannot-create-biological-information/