PROBABILITY
Though some have taken this tiny probability as an argument for “creation science,” the only thing it clearly indicates is that monkeys seldom write great plays.
—John Allen Paulos, Innumeracy, p. 75
CHANCE
Statistically, no matter how complex a sequence may be, given enough time, random selection of its elements can create (or evolve) that sequence. To illustrate the concept, it has been said that a thousand monkeys banging on a thousand typewriters will, given enough time, eventually “write” Shakespeare; this is the “tiny probability” referred to above. The sequence usually in question is the DNA sequence in chromosomes; the random chance is mutation.
So what is the chance that a monkey randomly tapping keys on a typewriter would “write” Hamlet? Actually the odds are rather easy to calculate. If X equals the number of symbols (letters, spaces, and punctuation) on the keyboard that may be randomly chosen and Y equals the number of symbols/characters that make up Hamlet, then, assuming each symbol has an equal chance of being chosen, X^{Y} equals the number of possible combinations, one of which is Hamlet. (Sorry about the use of exponents, but it is the only way to deal with such large numbers. X^{Y} works like 2^{2}, which means 2×2=4; 10 to any power is simply 1 followed by that number of zeros, so 10^{9} equals 1,000,000,000 or 1 billion.) For X, I came up with 48 as a rough estimate of the number of characters (74 if you want to worry about capitalization); what you include or leave out may raise or lower the count by a few. For our purpose here 48 is good enough: using 45 or 50 would change the end result by a large amount, but it won’t really matter in the end. How many characters are there in Hamlet: 50,000, 100,000, or 1,000,000? We’re talking characters (letters, numbers, punctuation, and the spaces between them), not words. A hundred thousand characters could be 15,000 or 20,000 words. I took Hamlet to have about 250,000 words (the Bible has about 850,000); this is a generous overestimate, since the play is usually counted at roughly 30,000 words, but the conclusion does not depend on it. If the average word has five letters, then 250,000 words is maybe 1,250,000 letters plus punctuation. Just for argument’s sake let’s use 1,500,000 characters for Y, so X^{Y}=48^{1,500,000}. This is a really big number, I mean really big. I saw somewhere that the number of possible combinations the four nucleic acid bases that make up DNA could form in human chromosomes exceeds the number of atoms in the universe. This number, too, is certainly greater than the number of atoms in the universe. A lot of monkeys banging on a lot of typewriters would (theoretically, in all probability) write Hamlet^{1}, but eternity will probably come to an end before it happens.
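To get a feel for the size of 48^{1,500,000} without writing it out, one can count its decimal digits with logarithms. The sketch below uses the rough figures from the text (X=48, Y=1,500,000); the function name is only illustrative.

```python
import math

def decimal_digits_of_power(base, exponent):
    """Number of decimal digits in base**exponent, found via logarithms."""
    return math.floor(exponent * math.log10(base)) + 1

X = 48          # rough count of typewriter symbols (from the text)
Y = 1_500_000   # rough count of characters used for the play (from the text)

digits = decimal_digits_of_power(X, Y)
print(f"48^1,500,000 has about {digits:,} decimal digits")
```

For comparison, commonly quoted estimates put the number of atoms in the observable universe at around 10^{80}, a number with only 81 digits.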
This is supposed to convince or reassure people that evolution is true? What it does show is that most people don’t really comprehend evolution, natural selection, or random mutation. They fixate on the random part and ignore the rest. The only thing random in evolution is what mutation occurs and when it occurs. Selection for or against a particular gene/mutation is not random.
A more accurate analogy using monkeys banging on typewriters would be a monkey randomly typing a sequence of symbols on a computer keyboard. When the number of symbols equals the length of Shakespeare’s play, the computer compares the random sequence to the target sequence (Hamlet in our example). The symbols that match, the computer “keeps” (positive selection); the ones that don’t match are deleted (selected against) and replaced by new randomly selected symbols. Statistically, with 48 symbols and 1,500,000 characters (“chances”), a random “mutation” will be right 1 time in 48, so the first pass should produce about 31,250 correct symbols. The computer repeatedly compares the randomly typed symbols to the target sequence (each comparison is equivalent to a generation), keeping the ones that match and replacing the ones that don’t until the entire sequence is correct. In this analogy the work involved is not X^{Y} but XY (X times Y, with X=48 and Y=1,500,000): each position needs, on average, 48 tries. XY is a much smaller number (much, much smaller) than X^{Y}. That is, it would take only about 72,000,000 keystrokes to write Hamlet, starting from nothing; and because every wrong position is retyped in each cycle, the number of cycles, analogous to generations, is smaller still, on the order of several hundred.
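The keep-what-matches procedure described above can be tried on a small scale. The following is a minimal sketch, not a model of real evolution: the 48-symbol alphabet and the short target line are my own stand-ins for the keyboard and the play, and the function name is hypothetical.

```python
import random
import string

# Assumed 48-symbol "keyboard": 26 letters, 10 digits, space, and 11 punctuation marks.
ALPHABET = string.ascii_lowercase + string.digits + " .,;:'!?-\"()"
TARGET = "to be or not to be, that is the question"  # stand-in for the whole play

def cumulative_selection(target, alphabet, rng):
    """Retype only the wrong positions each cycle; return (cycles, keystrokes)."""
    current = [rng.choice(alphabet) for _ in target]   # first random pass
    cycles = 1
    keystrokes = len(target)
    while True:
        wrong = [i for i, (c, t) in enumerate(zip(current, target)) if c != t]
        if not wrong:
            return cycles, keystrokes
        for i in wrong:                                # matches are kept, misses retyped
            current[i] = rng.choice(alphabet)
        keystrokes += len(wrong)
        cycles += 1

cycles, keystrokes = cumulative_selection(TARGET, ALPHABET, random.Random(0))
print(f"target length {len(TARGET)}: {cycles} cycles, {keystrokes} keystrokes")
print(f"X times Y for this target would be {len(ALPHABET) * len(TARGET)}")
```

The keystroke count should land in the neighborhood of X times Y, while the number of cycles grows only slowly with the length of the target, because all wrong positions are retried at once.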
But even this analogy isn’t realistic. “Evolution” didn’t start with a long random sequence, keeping the symbols that matched some “target” and replacing only those that didn’t. We’re talking about the evolution of a living, functioning organism: the sequence has to make some sense, and it starts with a working sequence.
I don’t want to get into a long, detailed discussion of the origin of life from nonliving chemicals and trust me, a detailed discussion would be long. All known living organisms are complex. They are all cellular in form, with their life substances and processes concentrated within a container separated from and protected from the outside world. We have trouble conceiving of a life form that isn’t contained in a cell. In the world today the conditions are such that the chemicals needed to sustain or start life are quickly consumed by any one of a number of living organisms and can never accumulate in such numbers or concentration (the “primordial soup”) to provide the conditions for the spontaneous generation of life.
In theory, a short chain of nucleic acids (these acids can form naturally under conditions of the early Earth and will form short chains of RNA or DNA) can replicate itself in a solution containing nucleic acids. Any sequence that also alters its environment in a way that increases its ability to replicate itself will do so faster than other sequences and it will continue to replicate automatically. We have here an evolutionary system that works by natural selection. It may not yet be “life” but it exists, reproduces, and competes against others.
There are, in round numbers, about 1,000,000,000 years between the formation of the Earth and the earliest known evidence of living cells (life as we know it). That is one billion years for the molecules of nucleic acids to form naturally and collect in short chains, only one of which had to replicate faster than natural processes destroyed it. This is all that is needed to start life.
So a more accurate analogy would start with a short sequence that is “sensible,” say the word “Hamlet.” At each “generation” a large number of copies would be produced, each with a small chance of containing a mutation. The mutation could be a substitution of one or several symbols for one of the existing ones, or a duplication of one or more of the symbols or of part of the sequence. Deletion of symbols or parts of the sequence is also an allowable mutation.
The copies are “checked” for accuracy or sensibility. Those copies which are accurate or sensible are selected for and used to produce another generation of copies. The nonsensible copies are selected against and eliminated. Sensible means only that the copy forms words or sentences, not necessarily that it means something that makes sense, such as: “To be or not to be, that is the question”. “Be or not be?” would be sensible.
At the first generation: Hamlet (an accurate copy), ham omelet, hamomelet, Hamlets, even spam omelet (which makes no sense) are “sensible” copies, but for example Z?Hamlet is not “sensible.” Each generation produces more copies, with the accurate or sensible copies going on to produce their own copies or altered versions.
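The copy, mutate, and check cycle just described can be sketched as a toy program. Everything specific here is an assumption made for illustration: the tiny word list standing in for “sensible,” the mutation rate, the number of copies, and all of the function names.

```python
import random

# Hypothetical toy dictionary: a copy is "sensible" if every word in it is listed here.
VOCAB = {"hamlet", "hamlets", "camlet", "haslet", "ham", "let"}
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def mutate(seq, rng):
    """Apply one random substitution, duplication, or deletion to seq."""
    i = rng.randrange(len(seq))
    kind = rng.choice(["substitute", "duplicate", "delete"])
    if kind == "substitute":
        # Replace one symbol with one or two random symbols.
        new = "".join(rng.choice(ALPHABET) for _ in range(rng.choice([1, 2])))
        return seq[:i] + new + seq[i + 1:]
    if kind == "duplicate":
        return seq[:i] + seq[i] + seq[i:]
    return seq[:i] + seq[i + 1:]

def is_sensible(seq):
    words = seq.split()
    return bool(words) and all(word in VOCAB for word in words)

def next_generation(population, rng, copies=500, mutation_rate=0.5):
    """Copy every survivor many times; keep only the sensible copies."""
    offspring = set()
    for parent in sorted(population):      # sorted for a reproducible order
        for _ in range(copies):
            child = mutate(parent, rng) if rng.random() < mutation_rate else parent
            if is_sensible(child):
                offspring.add(child)
    return offspring

rng = random.Random(1)
population = {"hamlet"}
for _ in range(10):
    population = next_generation(population, rng)
print(sorted(population))
```

With so small a dictionary very few variants can survive; a larger word list, or a looser test of “sensible,” lets the population diversify much faster.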
You should be able to see that very quickly there would be large numbers of sequences, many of them different, some very weird and wondrous. All these sequences would go on replicating, mutating, and evolving, and one will eventually become Hamlet; others will become the biological equivalents of War and Peace or The Cat in the Hat. Chance would play a big role in the actual evolutionary path followed and the actual results. I almost said “end results,” but there is no end. If you started it over from the beginning, even using the same starting sequence, you would still get entirely different results, functional but different. It would be like getting “Spamlet, Quince of Markland.” What results from a change is contingent on what preceded it. Stephen Jay Gould has written about the role of history in evolution and how improbable any one exact result (i.e. Hamlet, Prince of Denmark) is. Yes, someone will win the lottery, that is certain, but the likelihood, the probability, that it will be you (one precise outcome) is tiny, like that of a monkey writing Hamlet. “Something” had to evolve; that it is “us” is very unlikely. So “who” didn’t evolve? Someone has to win; odds are it will be someone else and not you.
Probability is not well understood by most people. You are far more likely to get in a car accident than to win the lottery. I mean that someone, actually a lot of “someones,” will be killed in car accidents for every person who wins the lottery. However, we still drive and buy lottery tickets, thinking we have a real chance of winning, all the time ignoring the very real chance of being killed driving and the very real chance of losing the lottery.
I found the odds of winning the Powerball lottery on a CBS news broadcast (8/22/01). The estimated odds of winning are 1:80,000,000^{2}. No one won the 8/22/01 drawing; four tickets won the next drawing, but the man who bought $6,000 worth of tickets wasn’t one of them. The odds of winning the McDonald’s Monopoly game are estimated at 1:440,000,000. The odds of winning the “Big Money Lottery” are 1:76,000,000. CBS reported (8/20/01) that the chance of being injured on an amusement park roller coaster ride is 1:23,000,000 and that there are 300,000,000 visitors each year. The odds of being injured badly enough on an amusement park ride to visit a hospital are 1:10,000,000. The Discovery Channel stated that the odds of being struck by lightning are 1:600,000 (months later on a newscast I heard the odds of being killed by lightning were 1:3,000,000). If you really think that you can win the lottery, don’t go out in the rain or go swimming in shark-infested water.
ODDS AND AVOGADRO’S NUMBER
If everyone understood probability,
Las Vegas and the lotteries would be out of business.
What does it mean to say that the odds of life spontaneously occurring are 1:1,000,000,000,000,000 (1 in 1×10^{15}, or 1 in 1,000 trillion, or 1,000 million million^{3})? Just for the sake of discussion, we’ll start with this number; we can change it later. When we say that the chance of a flipped coin coming up heads is 1:2, we mean that if you flip a coin it is as likely to be “heads” as it is “tails” for that one flip. Flip it again and the chance is still 1:2. Suppose the chance of winning a particular lottery is 1:10,000,000, the lottery is repeated each week, and you get a ticket each week. The chance of one of those tickets winning in 50 weeks is roughly 50:10,000,000 (1:200,000). If you buy a ticket for 5,000,000 weeks (96,153 years and about 10 months) you have, by the same rough reckoning, a 1:2 chance of having won. The chances are per event. If the lottery was twice a day it would take only 6849 years and 4 months for 5,000,000 events. If you played once a second it would take 10,000,000 seconds, 115.74 days, to reach a nominal 1:1 probability of winning (actually short of certainty, since the same number can be drawn again and maybe you chose a different number each time).

Table 1. Odds (in the US) of dying from various causes.

Cause                        Odds
motor vehicle accident       1:500
fire                         1:800
passenger aircraft crash     1:20,000
flood                        1:30,000
tornado                      1:60,000
venomous bite or sting       1:100,000
meteorite                    1:3000 to 1:250,000
fireworks accident           1:1,000,000
food poisoning               1:3,000,000

From Chapman, Clark R., and David Morrison. 1989. Cosmic Catastrophes. Plenum Press, New York.
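The lottery arithmetic above simply multiplies the per-event chance by the number of events. A short sketch, assuming every drawing is independent (the function name is mine), compares that shortcut with the exact chance of winning at least once:

```python
import math

def chance_of_at_least_one_win(p_per_event, events):
    """Probability of winning at least once in `events` independent tries."""
    # 1 - (1 - p)^n, computed with log1p/expm1 to keep precision for tiny p.
    return -math.expm1(events * math.log1p(-p_per_event))

p = 1 / 10_000_000   # per-drawing chance used in the text
for events in (50, 5_000_000, 10_000_000):
    simple = events * p
    exact = chance_of_at_least_one_win(p, events)
    print(f"{events:>10,} tries: n*p = {simple:.6f}, exact = {exact:.6f}")
```

For 50 tries the two figures are essentially identical; for 5,000,000 and 10,000,000 tries the exact values come out near 0.39 and 0.63 rather than 0.5 and 1, which is the sense in which 1:1 here is not quite a certainty.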
So if the chance of life spontaneously forming is 1 in 1×10^{15} for each event, what does that mean? Is the formation of the Earth one event, or are we talking about the chance encounter of various atoms and molecules in some “primordial soup” of chemicals, where each event is an encounter between any two or more atoms and/or molecules?
There is a number I learned in 11^{th} grade chemistry: 6.022×10^{23}. Actually I couldn’t remember the correct number any more, so I had to look it up (the Internet is good for some things). (It annoys me that I couldn’t remember it.) Anyway, this is the number of atoms or molecules in a quantity of a substance whose mass in grams equals its atomic or molecular weight (AKA a gram molecular weight, or a mole). This number is known as Avogadro’s Number. Carbon has an atomic weight of 12, so 12 grams of pure carbon has 6.022×10^{23} atoms of carbon. Water (H_{2}O) has a molecular weight of 18 (16 for the oxygen and 2 for the hydrogen). Eighteen grams of water is one mole of water and has 6.022×10^{23} molecules. That’s a little under four teaspoons, 18 cubic centimeters, a 2 × 3 × 3 centimeter block, about one cubic inch, give or take a few trillion molecules. A small quantity of matter contains a very large number of atoms and/or molecules, all interacting on a molecular scale. If each of these interactions is one event, how many are occurring each second? If you think about an explosion (either chemical, like TNT, or nuclear fission, like an atomic bomb), trillions × trillions of “events” occur in a fraction of a second. But life proceeds at a slower pace. For the sake of discussion let’s say one atom/molecule interacts with another atom/molecule once every second. Furthermore, conditions on Earth were only able to support life beginning four billion years ago (half a billion years after the formation of the Earth). How many events would that atom/molecule have been in after half a billion years? How about 15,780,000,000,000,000 events (1.58×10^{16}), so a 1:1×10^{15} chance event has 1.58×10^{1}, or about 16, chances of having occurred: not 1:16 but 16:1, not extremely rare, or even rare, but practically certain. A 1:1,000,000,000,000,000,000,000,000 event (1 in a trillion trillion, 1×10^{24}) has roughly a 1:63,000,000 chance of occurring, better chances than you have of winning some lotteries.
That’s just for one atom or molecule.
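Here is a rough check of the one-event-per-second arithmetic for a single atom or molecule. It assumes a year of 365.25 days (my assumption; the text does not say which year length it used), and the function names are illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumed year of 365.25 days

def events_per_particle(years, events_per_second=1.0):
    """Number of interactions one particle takes part in over `years`."""
    return years * SECONDS_PER_YEAR * events_per_second

def expected_occurrences(odds_against, events):
    """Expected number of times a 1-in-`odds_against` event occurs."""
    return events / odds_against

events = events_per_particle(500_000_000)   # half a billion years
print(f"events for one particle: {events:.3e}")
for odds in (1e15, 1e24):
    print(f"1 in {odds:.0e}: expected occurrences = "
          f"{expected_occurrences(odds, events):.3g}")
```

An expected count well above 1 means the event is all but certain to have happened at least once; an expected count far below 1 can be read directly as the approximate chance that it happened at all.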
Fortunately we do not need to consider all the atoms in the Universe or even all on Earth. We don’t even have to consider all the atoms/molecules in the ocean; we only need to consider the atoms/molecules in the “primordial soup,” some small pocket(s) of nutrient-rich broth in a small and protected spot. It would not be a pure sample of any one atom/molecule. It would be a mixture of atoms and molecules in various numbers, and it would be neither possible nor necessary to calculate what a mole of it would weigh. A mole of water is eighteen grams. Some of the simpler molecules could have moles of several hundred grams. A soup composed of several dozen kinds of molecules plus atoms, each present as a small percentage of Avogadro’s number (0.05% is about 3.01×10^{20}, not 3.01×10^{23}), might weigh several tens of thousands of grams (several tens of kilograms, or 22s of pounds). We could have 7.5×10^{21} atoms/molecules in a few gallons of soup, and the number of events (interactions) could be on the order of 3.75×10^{21} per second: a trillion trillion events in less than 267 seconds, that’s less than 5 minutes. Over half a billion years (about 1.58×10^{16} seconds) that comes to roughly 3.75×10^{21}×1.58×10^{16}=5.9×10^{37} events. Very improbable events, given enormous numbers of chances over extremely long periods of time, are not unlikely; they are certain.
These “calculations” (estimates) require that all interactions be random and have an equal probability, which they don’t. Some reactions are more likely to occur, and some will increase in likelihood as earlier interactions provide precursor molecules, etc. What we would start with is a broth of atoms and simple molecules (carbon, hydrogen, nitrogen, CO_{2}, H_{2}O), which builds up to slightly more complex molecules (ammonia, etc.; see the Urey-Miller experiment), and then to even more complex molecules (simple proteins). It starts with a few atoms, which increase in number; then come molecules; then the more you have the more you get, and in geometric fashion the chain reaction accelerates: more reactions breed more, and more complex, results more often. The spontaneous evolution of life is not a “miracle” needing divine intervention. Once conditions on Earth reached the appropriate state, life was bound to evolve, given enough time; a few hundred million or half a billion years is enough.
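The soup estimate can be checked the same way. This sketch takes the particle count from the text, reads the halving of that count as “each interaction involves two particles” (my interpretation), and again assumes a 365.25-day year.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumed year of 365.25 days

particles = 7.5e21                  # atoms/molecules in the soup (from the text)
events_per_second = particles / 2   # assumption: one pairwise interaction per particle per second

seconds_to_trillion_trillion = 1e24 / events_per_second
total_events = events_per_second * 500_000_000 * SECONDS_PER_YEAR

print(f"events per second: {events_per_second:.3e}")
print(f"time to reach 1e24 events: {seconds_to_trillion_trillion:.0f} s "
      f"({seconds_to_trillion_trillion / 60:.1f} min)")
print(f"events in half a billion years: {total_events:.2e}")
```

None of these inputs is measured; changing the particle count or the interaction rate by a factor of a thousand either way still leaves an enormous number of events.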
The calculations of the microscopically small chance of a spontaneous origin of life are based on the chance that a protein or amino acid will be created by the random meeting of a number (10s to 100s) of atoms, all combinations having an equal probability. Amino acids combine to form proteins and there can be 50 to 50,000 amino acids in a single protein molecule.
Glycine is the simplest amino acid. Glycine has an amino group (NH_{2}), a carboxyl group (COOH), and two hydrogen atoms attached to a carbon. More complex amino acids may have seven amino groups, a second carboxyl group, a carbon ring, sulfur, or more nitrogen, hydrogen, and/or carbon. These can form rather readily, as the Urey-Miller experiment demonstrated. It is claimed that proteins are more difficult, but are they? Proteins may be composed of 100s of atoms, but those atoms are grouped into amino acids. Proteins do not have to form from random collections of atoms; their building blocks are amino acids. Once amino acids have formed, proteins are very, very probable.
Any discussion of the probability of the spontaneous occurrence of life is pointless. I only discussed it to show that the really minuscule chances are countered by the sheer number of “events.” Besides, there is a large difference between “extremely improbable” and “not-at-all probable.” Creationists have abused probability. They assume, without stating it, that all events are equally probable, that they are independent of each other, and (biggest of all) that it all happened in a single step. We really have not got enough information, or facts, to calculate the probability of life with any degree of accuracy or meaning. Extremely low probability is not the same as impossible. An extremely low probability may often be an indication that we lack enough relevant data to accurately calculate the probability of the event occurring. Besides, even if life appearing in this form is extremely improbable, that doesn’t actually say much about the probability of life occurring in any form.
Suppose you bought one ticket in a lottery where the chance of winning is 1:80,000,000. It would be very improbable that you bought the winning ticket. The odds that any other particular person bought the winning ticket are the same. The odds of a specific, single ticket being a winner are 1:80,000,000; every last one of them has the same chance of winning, and the same chance of losing: 79,999,999:80,000,000. Yet nobody is surprised that a ticket wins. Nobody claims that God was necessary for a winning number to occur. The odds that there would be a winning number were 1:1 (well, actually not, because not all numbers may be issued and some numbers are issued more than once). I have seen the figure of 37% as the proportion of lottery drawings in which the winning number was issued on a ticket. This is why some of the lotteries rise to such large payouts: nobody won for several drawings. So the saying “Somebody has to win” is actually “If they repeat it enough, somebody will eventually win.” But for the sake of argument we will assume that for this lottery the winning number is drawn from those tickets sold. Nobody is surprised that there was a winning ticket. The probability of your ticket winning is 1:80,000,000. The probability of there being a winning ticket is 80,000,000:80,000,000 (1:1). “But,” you say, “there is a difference between the chance of a winning ticket being drawn and the chance of the winning ticket being your ticket.” Yes, there is, and when Creationists calculate the probability of the spontaneous emergence of life, what they are calculating is the probability of the spontaneous emergence of this particular life (the chances of your ticket winning, or, as with the monkeys, Hamlet the Prince of Denmark) and not the chances of the spontaneous emergence of any life (the chances of there being a winning ticket: Hamlet the Prince of Denmark or Spamlet the Quince of Markland).
They treat (without making it clear) the chance of having the winning ticket as if it were the same as the chance that there is a winning ticket. The odds are not the same, and to say that the odds that life spontaneously emerged in this exact form are so low as to be impossible, and so to require divine assistance, is irrelevant to the whole question.^{5}
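The gap between “my ticket wins” and “some ticket wins” is easy to put in numbers. This is a minimal sketch with hypothetical function names; the two ways of selling tickets it models (every number sold exactly once, or every buyer picking a number independently at random) are simplifying assumptions, not a description of any real lottery.

```python
import math

def p_specific_ticket(possible_numbers):
    """Chance that one particular ticket carries the winning number."""
    return 1 / possible_numbers

def p_some_ticket_wins(possible_numbers, tickets_sold, all_distinct=False):
    """Chance that at least one of the tickets sold carries the winning number."""
    if all_distinct:
        # Every ticket has a different number.
        return min(1.0, tickets_sold / possible_numbers)
    # Buyers pick numbers independently, so duplicates and gaps both occur.
    return -math.expm1(tickets_sold * math.log1p(-1 / possible_numbers))

N = 80_000_000
print(f"your ticket:                    {p_specific_ticket(N):.2e}")
print(f"some ticket (all numbers sold): {p_some_ticket_wins(N, N, all_distinct=True):.2f}")
print(f"some ticket (random picks):     {p_some_ticket_wins(N, N):.2f}")
```

Under either assumption the chance that somebody wins is tens of millions of times larger than the chance that a named ticket does, which is the distinction the argument turns on.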
Here are a couple of quick probabilities: We know of one inhabitable planet, Earth. Life exists on it, so the probability of life is 1:1. Or, there are 9 known planets and one has life, so the odds are 1:9. Of course not all planets are inhabitable^{6}, so maybe we should say there is one solar system we know about and there is life in it, so again 1:1 odds. Of course it’s all ridiculous and not relevant. We don’t know enough to calculate the odds.
I know about Drake’s Equation (N=R_{*}•f_{p}•n_{e}•f_{l}•f_{i}•f_{c}•L, where N=the probable number of technological civilizations at any one time with intelligent life forms that are capable of space exploration). There are no numbers, just seven unknown quantities which, if we knew them, would let us calculate the answer. It is not the same as the number of planets with life forms. The answer is at least one (us here on Earth) and could be as high as billions, given the number of stars in the Galaxy, let alone the Universe. There are billions of galaxies with billions of stars with, probably, planets.
AGAINST ALL ODDS
Time and again it is claimed that something could not have happened because of the astronomical odds against it (e.g. the spontaneous origin of life). Well, the odds against you existing (the “you” meaning the specific, exact sequence of DNA, genes, various alleles of those genes, etc.) are so astronomical that, by such reasoning, you cannot possibly exist: neither you, nor I, nor the six billion plus other humans.
Each of us is a precise combination of genes that, except for an identical twin, is unique in the entire world. No other human, living or dead or yet to be born, has the exact same genes. I am not going to calculate the actual odds of any exact combination occurring, but I will provide some indication of the probable range.
First, each individual is the result of a single act of conception: the combination of genetic material from one particular egg and one particular sperm. Each egg and sperm is the product of meiosis. During meiosis the DNA in a cell is divided in half so that the gamete that is formed has one-half of the normal genetic material to match with one-half of the genetic material from the other parent, thus providing the fertilized zygote with the proper amount of DNA.
During the meiotic division the chromosomes assort and randomly divide into halves. The two halves of each chromosome do not automatically go with the same halves of all the other chromosomes. Humans have 23 “pairs” of chromosomes (22 pairs and either two X chromosomes or an X and a Y chromosome). The number of possible combinations of just the chromosomes is 2^{23}; with two parents that makes 2^{23} × 2^{23} possible combinations, that is, 2^{46}. But chromosomes do more than randomly reassort during meiosis; for example, they also exchange parts of their genetic material with other chromosomes, invert sections, and mutate. These various actions function to reassort genes on the chromosomes themselves. Over time the genes can be nearly as randomly sorted as the chromosomes they are on.
I have seen estimates that the human body may have 30,000 genes. These are active genes, not the inactive, unexpressed “junk” that forms the majority of our DNA. (We have about three billion base pairs of DNA.) Each gene can exist in different versions: alleles. The major blood type (the ABO system) has three alleles (A, B, and O); there are also the Rhesus factor (Rh- and Rh+), the Duffy system, MNS, Kell, Diego, Lutheran, and sickle-cell anemia. These, however, are different genes rather than allelic variants of the ABO system. So rather than each gamete having one of 2^{23} possible combinations, the number might be closer to 2^{30,000} combinations (assuming two alleles for each gene and counting only the active genes, not the junk DNA), or for a fertilized zygote, 2^{60,000}. However, what are the chances that your parents are the two that they are? With about six billion people now living, theoretically your parents could have been any two of these six billion. Assuming a 50:50 sex ratio, that is three billion times three billion, or about 2^{63}, possible parental pairs, each pair offering a different set of 2^{60,000} possible gene combinations. That’s 2^{63} × 2^{60,000}=2^{60,063} possible gene sequences at fertilization. All the dead, living, and yet-to-be-born humans have hardly exhausted the possible combinations.
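The counting in the last two paragraphs can be followed step by step. The sketch below keeps the text’s simplifying assumptions (30,000 genes, two alleles each, six billion people split evenly by sex) and counts parental pairs as the number of men times the number of women; the variable names are mine. It works with exponents rather than printing the giant numbers themselves.

```python
import math

chromosome_sets_per_gamete = 2 ** 23                            # one of each pair, 23 pairs
chromosome_sets_per_zygote = chromosome_sets_per_gamete ** 2    # two parents: 2**46

genes = 30_000                       # estimate quoted in the text
zygote_exponent = 2 * genes          # 2**30,000 per gamete, two gametes per zygote

men = women = 3_000_000_000          # six billion people, 50:50 sex ratio
parental_pairs = men * women
pairs_exponent = math.log2(parental_pairs)   # pairs expressed as a power of 2

total_exponent = zygote_exponent + pairs_exponent
decimal_digits = math.floor(total_exponent * math.log10(2)) + 1

print(f"chromosome combinations per gamete: {chromosome_sets_per_gamete:,}")
print(f"chromosome combinations per zygote: {chromosome_sets_per_zygote:,}")
print(f"parental pairs: about 2^{pairs_exponent:.0f}")
print(f"gene combinations: about 2^{total_exponent:,.0f}, "
      f"a number with roughly {decimal_digits:,} decimal digits")
```

Note that the choice of parents adds only about 63 to an exponent that is already 60,000: nearly all of the improbability comes from the reshuffling of genes, not from who met whom.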
Realistically there is not an equal chance that any one human will have children with any other human of the opposite sex or that each child will have different parents. Parents have to meet before they can mate, so they generally have grown up near each other (and share similar genes), and they often have more than one child. This reduces the odds some, but not much; and if we add in the junk DNA and possible mutation events (and how can those be accurately figured in?), the number of possible combinations, and with it the odds against any one particular combination occurring, goes up astronomically.
This is why I’m not calculating the probability: I can’t do the math and I’m not sure anyone else can, at least not to a realistic number. All that really matters is that the outcome is really, really, hugely improbable. You, my reader, are the improbably improbable result: you, me, and six billion other people, and the hundreds of new babies being born each minute (or is it thousands each second?).
Impossibly improbable? Yet it happens repeatedly. Perhaps you are thinking that I have done something wrong in estimating the probability. Maybe I have been sneaky or underhanded. Well, yes and no. I have tried to estimate the probability honestly. And yes, I have been sneaky. The probability was for someone to be born with a single exact DNA sequence, yours for instance, not the chance that someone would be born with any DNA sequence; that probability is one: it will happen. People buy lottery tickets (not me) because someone will win and it could be them, but mostly it won’t be. The problem is one of the probability of an exact result specified in advance versus the probability of any result.
The probability of the spontaneous origin of life as we know it may be extremely unlikely but that may not be the same as the probability of the spontaneous origin of any form of life.
The calculations of the probability of life spontaneously occurring have one thing in common, besides their highly improbable results: they all assume it is life as we know it here on Earth. This is a specified outcome, a “special event,” identical to asking what the odds are that a specified number (generally the number bought in the lottery) will win the lottery, which are also improbably low. However, the odds of there being a winning number are much higher (though, because not all numbers are bought and not all tickets may have unique numbers, the actual odds of there being a winning ticket are about 37%, based on actual lottery practice and experience). We know that life, as we know it, did emerge here on Earth, so in one sense, discussing its probability amounts to asking whether the life that emerged here on Earth is the life that emerged here on Earth. What is the probability that what is, is what is?
On the other hand, what really is being argued is not the probability that life spontaneously emerged here but the probability that any form of life will emerge anywhere in the Universe. That is an entirely different probability, more like the probability of there being a winning lottery number.
There are hundreds of billions of galaxies in the Universe, each with billions of stars; some fraction of them have planets and potentially have life-friendly conditions. Some fraction of billions times hundreds of billions is a large number, so the odds aren’t that long.
It was Mark Twain^{7} who said that there were three kinds of lies: lies, damned lies, and statistics. Probability is a type of statistic. You need to be very certain what you are talking about. You are dealing with very large numbers of events multiplied by very long time frames that can make any probability factor meaningless, even if you have got the problem right in the first place.