Information, DNA and evolution.

Many of those who are interested in the subject of evolution and life point out that the genetic code is a tremendous carrier of information, and often raise the question of where that information comes from. 

Intuitively we know that there is more information in, say, a recipe for baking a cake than in the statement that “it is raining”, but it is difficult to intuitively define or quantify information.

Information is something that can be transmitted from ‘A’ to ‘B’ (through space or time or whatever) which gives ‘B’ the ability to know something that they didn’t know before.  Information conveys some meaning.

By itself a steady white light shone from A to B can only convey a tiny bit of information, perhaps only that “there is a light at A”.  And if the light flashes once per second it might convey that “there is a light at A that flashes once per second”.  With the addition of a decoder, say a lighthouse signal book, a regularly flashing light might convey the information that “that is Portland Bill lighthouse”.  However, in that case the additional information that “Portland Bill light transmits such and such a sequence of flashes” has already been transmitted, and so perhaps it’s more accurate to say that the flashing light ‘activates’ the previously transmitted information, or that the previously transmitted information ‘decodes’ the information in the flashing light.

Often information needs to be transmitted from A to B in a secure way so that ‘C’ and ‘D’ cannot understand it, i.e. the information from A cannot be activated by anyone other than B.  The goal is to make the information indecipherable without the decoder or ‘key’.  In that case, what might seem to be a string of random letters does actually contain a vast amount of information.  Yet without the decoder, the highly informative signal and the string of random letters look very similar; in practice both signals have the same potential to carry information.  Consider the following strings of letters and spaces:

  1. life exists on earth
  2. hlif eexist so neart
  3. kudw wzuara ib wlerg
  4. ne wvkdmtfcng cdjvgd

It is easy to see that the second contains the same information as the first, but with each letter moved one place to the right (the final ‘h’ wrapping round to the front) and the spaces kept in the same place.

The third sentence is less obviously not random, but there is a hint that it might convey the same information in that the word lengths are the same.  After a little time sat at a typewriter one might realise that each letter has been replaced by the one to its left on a standard UK keyboard (with ‘a’ wrapping round to ‘l’), so the key is to type the letter to the right of each one in the third sequence.

The fourth sentence is indeed random.
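
The two keys described above can be sketched in a few lines of Python.  This is only an illustration: the function names and the keyboard row layout are my own assumptions, and the wrap-around at the end of each row is inferred from the example strings.  Applied to the first string it reproduces the second and third (ignoring capitalisation).

```python
# Letter rows of a QWERTY keyboard (assumed layout, letters only).
KEYBOARD_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def rotate_letters(text, places=1):
    """Move every letter `places` to the right, leaving the spaces where they are."""
    letters = [c for c in text if c != " "]
    if not letters:
        return text
    places %= len(letters)
    rotated = iter(letters[-places:] + letters[:-places])
    # Rebuild the string: spaces stay put, letters are taken from the rotated list.
    return "".join(" " if c == " " else next(rotated) for c in text)


def keyboard_shift(text, step):
    """Replace each letter by the key `step` places along its keyboard row (wrapping)."""
    out = []
    for c in text:
        for row in KEYBOARD_ROWS:
            if c in row:
                out.append(row[(row.index(c) + step) % len(row)])
                break
        else:
            out.append(c)  # spaces and anything not on a letter row pass through
    return "".join(out)


plain = "life exists on earth"
print(rotate_letters(plain))       # string 2
print(keyboard_shift(plain, -1))   # string 3: one key to the left
print(keyboard_shift(keyboard_shift(plain, -1), 1))  # decoded again
```

In both cases the message is recovered only by someone who already holds the key, which is the point of the lighthouse example earlier.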

When C or D intercepts a string of letters from A they may attempt to decode the string without knowing the key.  For short strings this is practically impossible, but for longer strings it may be possible to find, for instance, repeating patterns that can be matched to known phrases.  We might look for the most common letter in the string and assume that it stands for the letter ‘e’, and so on.  We then judge whether we have broken the code by whether the resulting new string of letters has any meaning.  But once again, C or D must be able to recognise the meaning when they see it.  They must, for instance, know the language that A and B speak, so they too have received some prior information by another route.
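
The ‘most common letter is e’ guess can be sketched as follows.  This is a minimal illustration under stated assumptions: the cipher is a simple alphabet shift (a Caesar cipher, which is not one of the keys used above), the message is lowercase English, and the function names are hypothetical.

```python
from collections import Counter


def caesar(text, shift):
    """Shift each lowercase letter `shift` places along the alphabet, wrapping at 'z'."""
    out = []
    for c in text:
        if "a" <= c <= "z":
            out.append(chr((ord(c) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(c)  # leave spaces and punctuation alone
    return "".join(out)


def guess_shift(ciphertext):
    """Guess the shift by assuming the most common letter stands for 'e'."""
    counts = Counter(c for c in ciphertext if "a" <= c <= "z")
    most_common = counts.most_common(1)[0][0]
    return (ord(most_common) - ord("e")) % 26


message = "meet me here between the trees at seven"
intercepted = caesar(message, 3)
shift = guess_shift(intercepted)
print(intercepted)
print(caesar(intercepted, -shift))  # meaningful only to a reader who knows English
```

The guess works here because the message is long enough for ‘e’ to dominate; on a short string it can easily fail, and in either case a person still has to recognise that the output means something.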

We can represent a string of DNA bases by a string of letters (we have immediately introduced a ‘code’ that needs a decoder by doing this of course).

From our scientific experimentation we have discovered that many of these strings contain information.  We have for example found that the machinery within the cell is able to convert the DNA string into proteins: the cell is able to decode the DNA.  Knowing that DNA is a code has led to a lot of effort aimed at identifying what it does; at decoding it.  The first step has been to identify the complete code, hence the Human Genome Project.  Once the complete string has been generated we can try to decode it.
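
As a rough illustration of what ‘decoding’ means here, the sketch below reads a DNA string three letters at a time and looks each triplet up in a small slice of the standard genetic code.  It is a simplification: the table is deliberately partial, the example sequence is made up, and the intermediate copying of DNA to RNA that the cell performs is skipped.

```python
# A partial slice of the standard genetic code, written as DNA triplets (codons).
# One-letter amino acid codes; "*" marks a stop signal.
CODON_TABLE = {
    "ATG": "M",  # methionine, also the usual start signal
    "TTT": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCT": "A",  # alanine
    "AAA": "K",  # lysine
    "TGG": "W",  # tryptophan
    "TAA": "*",
    "TAG": "*",
    "TGA": "*",
}


def translate(dna):
    """Read `dna` codon by codon and return the amino acid string up to the first stop."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino = CODON_TABLE.get(dna[i:i + 3], "?")  # "?" = codon not in this partial table
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


print(translate("ATGTTTGGCTAA"))  # made-up example sequence
```

The lookup table plays the same role as the lighthouse signal book: the string of letters means nothing until something that already holds the table reads it.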

According to the Human Genome Project website (http://www.ornl.gov/sci/techresources/Human_Genome/project/info.shtml) less than 2% of the complete string of human DNA actually contains the codes that define the amino acid sequences in proteins.  About half of the genome contains repeating sequences that don’t code for protein, and are often called ‘junk’ DNA; since it was unknown what they do, the initial response was to reject them as junk.  However, as the above-referenced site states: “Deriving meaningful knowledge from the DNA sequence will define research through the coming decades to inform our understanding of biological systems. This enormous task will require the expertise and creativity of tens of thousands of scientists from varied disciplines in both the public and private sectors worldwide.”  Indeed, recent research by the ENCODE project suggests that most of the DNA is useful, not for making proteins but for controlling the process by which they are made.

As an aside, the techniques used in the Human Genome Project have been applied to identifying the bacteria that caused the Black Death.  It seems that their DNA is not so different from that of plague bacteria around today; perhaps we should be worried…  http://www.nature.com/nature/journal/vaop/ncurrent/full/nature10549.html

A question often asked is, “where does all the information come from?”

Much of what we do generates information.  Forensic science is highly developed at decoding clues to determine the likely course of events in criminal cases.  North American Indian trackers can follow people for many miles based on the information left by footprints.  Air crash investigators read the information left on the debris to try to determine what caused a given disaster.  The information is physically recorded in the ‘clues’, and our knowledge and intelligence are able to ‘activate’ the information.  In many cases the information can be traced back eventually to an intelligent source, although that cannot be concluded when, for example, decoding the information held in geological rock formations.

However, whilst all of these activities generate information, they are basically one-off events that need to be deciphered.  None generates the sort of information found in this sentence, for instance.  None generates information in a code-like format; none generates a sequence of instructions.

In all of our daily experience of instructional information transfer, of codes and deciphering, the information has been generated by an intelligent mind.  So the question behind the question is, “is the information contained in the DNA code generated by an intelligent source?”

It is argued that an unintelligent machine cannot generate more information than is inherently within the machine.  For example, can we imagine a computer program coming up with an equation that has not already been programmed into it?  It is then argued that the cell is a molecular machine and so unable to generate more information than is contained within it, and hence that there must be an external Intelligent Designer that has generated and implanted the information in the cell.  However, I don’t find these arguments convincing.

A cellular machine operates within an environment.  If, for example, a mutation changes the information the cell contains, then the survival or death of the mutated cell adds the information that the mutation was good or bad: the good mutation survives, the bad one fails, and more information is added to the DNA.  It seems to me that this is a perfectly adequate explanation for the generation of the information in DNA, and is completely consistent with the type of God I describe in “The God of Science”.
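
The idea that survival feeds information from the environment back into the DNA can be shown with a toy model.  It is only a sketch under loud assumptions: the ‘environment’ is represented by a hypothetical string that the genome is scored against, one base mutates per generation, and a mutant survives only if it fits at least as well as its parent.  Real selection is not a comparison against a target string; the model shows only that random change plus survival can move information from the environment into the sequence.

```python
import random

BASES = "ACGT"


def fitness(genome, environment):
    """Count the positions at which the genome matches the (hypothetical) environment."""
    return sum(g == e for g, e in zip(genome, environment))


def mutate(genome, rng):
    """Return a copy of the genome with one randomly chosen base replaced at random."""
    i = rng.randrange(len(genome))
    return genome[:i] + rng.choice(BASES) + genome[i + 1:]


def evolve(genome, environment, generations, rng):
    """Repeated mutation, with survival deciding which changes are kept."""
    for _ in range(generations):
        mutant = mutate(genome, rng)
        # Selection: the mutant replaces its parent only if it fits at least as well.
        if fitness(mutant, environment) >= fitness(genome, environment):
            genome = mutant
    return genome


rng = random.Random(0)  # fixed seed so the run is repeatable
environment = "".join(rng.choice(BASES) for _ in range(30))
start = "A" * 30
end = evolve(start, environment, 2000, rng)
print(fitness(start, environment), "->", fitness(end, environment))
```

Nothing in the mutation step knows about the environment; the only place its information enters is the survival test.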

The DNA enigma

DNA is amazing stuff.  A precisely structured sequence of base pairs that is unique to each of us as an individual.  A record of our ancestral history.  A template for the manufacture of our proteins.  The blueprint for each of us.

The human DNA chain of around 3 billion characters has been assembled over perhaps the last billion and a half years (from the first evidence of cells with a nucleus), and has changed with the changing animals that carried it, through perhaps a billion generations.

DNA appears to be the mechanism of inheritance, the instruction set that ensures that beneficial features from parents are transmitted to the offspring.  It appears to be the key that defines a naturalistic explanation of how we have come to be here.  But is it?

Is there enough information within DNA to define each of us?  Or is something more needed?

Remembering that each of us begins as a single fertilised cell containing the combined DNA from our father’s sperm and our mother’s egg, let’s remind ourselves of what the information in the DNA is being asked to define.

  1. The precise geometric construction of our bodies:
    1. The position, shape, type and interconnection of each of our fifty trillion cells
    2. The complete development cycle, which is robust enough to cope with different environments and with physical damage, and which maintains the living organism as a functional entity at each stage in the process
    3. Major systems, fully functioning and cooperating with each other
      1. Circulatory System
      2. Respiratory System
      3. Immune System
      4. Skeletal System
      5. Excretory System
      6. Urinary System
      7. Muscular System
      8. Endocrine System
      9. Digestive System
      10. Nervous System
      11. Reproductive System
      12. A fully programmed brain that can control the operation of the body, but that can also think, conceptualise, communicate, empathise, create works of art and music, appreciate beauty, love, hate, choose.  A brain that appears to have, and for all practical purposes has, free will.

This is weighty stuff to place onto DNA.

Indeed, the functionality does not seem to match the information capacity of the DNA; the DNA of some amoebae is reported to be many times longer than that of a human (ten times or more, by commonly quoted estimates), yet the functionality is minimal in comparison.
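
To put a number on that capacity, here is a back-of-envelope calculation using the figures quoted above (around 3 billion characters and fifty trillion cells).  The one assumption added is that each base, being one of four possibilities, can carry at most two bits; this gives a raw upper bound on storage, not a measure of what the sequence means.

```python
import math

BASES_IN_GENOME = 3_000_000_000      # "around 3 billion characters" (from the text)
CELLS_IN_BODY = 50_000_000_000_000   # "fifty trillion cells" (from the text)
BITS_PER_BASE = math.log2(4)         # four possible bases, so at most 2 bits each (assumption)

capacity_bits = BASES_IN_GENOME * BITS_PER_BASE
capacity_megabytes = capacity_bits / 8 / 1_000_000
bits_per_cell = capacity_bits / CELLS_IN_BODY

print(f"Raw capacity: {capacity_bits:.0f} bits, about {capacity_megabytes:.0f} megabytes")
print(f"Bits available per cell of the body: {bits_per_cell:.6f}")
```

On these figures the whole string holds well under one bit for each cell whose position, shape and type it is being asked to define.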

Has familiarity bred contempt?  Do we see ourselves too superficially?  Have we lost our awe at our own construction?  Have we deluded ourselves into thinking that we understand?

Have we forgotten that all that we are physically began with that one cell?  One cell and its DNA: is it really sufficient to make a human?
