Bad Science, Worse Philosophy: the Quackery and Logic-Chopping of David Foster's The Philosophical Scientists (2000)
9. The Odds of Life Evolving by Chance
I have surveyed most of the strangeness of Foster's book. But I have left for last the two most crucial points: his 'odds of life' argument, upon which he bases his entire refutation of Darwinism, and his 'universe is mind' argument, upon which he bases his theory of god and creation. I will tackle the first here. On page 172 Foster himself claims that "the critical moment is when one finds proofs of an intelligence which exceeds human intelligence, and in this book the critical point was the realization of biological (improbable) specificity." His only examples are hemoglobin and the genome of the T4 bacteriophage, neither of which represents anything near the first living organism, and neither of which is believed by any genuine scientist to have formed spontaneously. Thus, Foster's entire book is based on one giant mistake: his argument is that hemoglobin and the T4 genome are too complex to have arisen by chance, natural selection proposes that they arose by chance, therefore natural selection is false. But his central premise is false: natural selection does not propose that hemoglobin and the T4 genome arose by chance. The undeniable evidence of evolution shows a gradual step-wise increase in complexity of living organisms over time, and this contradicts any assumption that complex life ever arises by chance. The only thing that happens 'by chance' according to the theory of natural selection is mutation, apart from the spontaneous creation of the first living organism from which descended all the rest (and just like all single-celled organisms, the first life did not have hemoglobin).
In other words, in all of his examples of the statistical improbability of evolution by natural selection, Foster fails to actually account for natural selection in his calculations. As Foster claims on page 178:
Specificities (improbabilities) now emerging from molecular biology involve numbers such as 10^78,000 (Chapter 14), and these never could have evolved by Darwin's theory of natural selection.
But nowhere does he actually show that natural selection cannot produce such complex entities. All he does is calculate the odds of certain highly complex entities being produced by no process at all but random chance----in other words, without natural selection. He does a lot of math, using the same basic probability equation, throughout the heart of his book to prove this irrelevant point.
[On all statistics used this way, by every
I believe that Foster, like most creation scientists, is lazy. He does not want to actually face the monumental complexity of the problem he has set out to solve. Indeed, no one is actually able to do so, for it is a task that would challenge and even elude the most brilliant biochemists. To actually calculate the odds of 'life' developing from inanimate matter, one must be acquainted not only with a vast array of data and know how to estimate all the statistical relationships involved, but one must even know things that no one on Earth presently knows, or ever may know. To begin with, to actually calculate the 'odds of life evolving by chance' one must calculate the odds of the first living (i.e. replicating) organism arising by chance. But no one knows what that first organism was, for it naturally had no bones and thus left no fossils, and it certainly would have been vastly overpowered and driven to extinction by its more advanced descendants, born of successive mutation and selection. It is not even known if this first life was DNA-based, much less how complex it was [See Addendum C for the scientific point of view on this]. Foster makes no attempt to even guess at what this first creature might have been or how complex it was. His use of hemoglobin is misguided, for that molecule is only found in certain multicellular animals, and is thus a late, not an early, development, and his use of the T4 is even more misguided, for despite its simple nature the T4 is a fairly advanced organism, and in no way resembles the first life (in fact, since it is dependent on bacteria for its life, it must have evolved after bacteria).
But even if Foster entreated a talented biochemist to estimate the simplest possible biochemical replicator, his task would only be beginning. The odds of such a replicator forming by chance would not be based, as Foster seems to think, on its complexity alone. The chances would have to be calculated based on the number of materials available (e.g. more than one different molecule may serve the same purpose at any given point in a chain), the probability that they will form into collectives (e.g. amino-acids naturally chain, water molecules do not), and the number of tests (e.g. the number of chemical reactions that occur in a given environment, and the number of times any kind of chain or collective is formed in the population). Foster is even more incapable of accounting for all these factors than I am. I would be surprised if even a biochemist of Nobel Prize stature could actually work out a realistic probability based on all the possible and required factors. It is worth noting that no Nobel Prize winner has ever even tried.
But the chances of abiogenesis would only be half the story. Evolution by natural selection would then have to account for all developments after that point, and to calculate the odds of any such outcome one would have to account for all three elements of the process of natural selection: mutation, reproduction, and selection. So, one must know and enter into their equations what the odds are of a mutation during replication (this will change with every organism), then one must know how many of the mutations will be beneficial (again, this will vary with every generation, and will even depend upon the environment at the time, since unfavorable mutations in one environ will be favorable in another), then one must calculate how quickly bad mutants and non-mutants will be killed or displaced by the good mutants taking over, which requires knowing the rate of reproduction as well as the ecological capacity of the environment, which will place an upper limit on the total population as well as its rate of reproduction. This requires juggling so many variables that I honestly do not believe it is even possible to attempt an accurate estimate of the odds of any organism evolving from any other. And anyone, like Foster, who claims to have reliably done so can readily be assumed to be either lying or mistaken.
Perhaps Foster thinks smaller means simpler. Perhaps he thinks a bacteria-consuming virus today resembles the first life on Earth. He would be wrong to assume this, for no biologist does. Nevertheless, that appears to be his assumption. On page 82 he rather arbitrarily introduces the T4 bacteriophage, because, Foster says, it is "a tiny creature which preys upon bacteria" and so "its DNA must be one of the smallest specimens." But any biologist will tell you that size is not a reliable predictor of the complexity of an organism's genome. Since the T4 eats bacteria, it must have evolved after bacteria, which means that its DNA (the T4 is a virus, though one whose genome is indeed DNA rather than RNA) could be more complex than that of the bacteria it eats (though this happens not to be true, see Addendum H). And even the DNA of bacteria is incredibly advanced. In fact, all life presently on Earth is highly advanced, and is certainly far more complex than its ancestors of three billion or more years ago. Even the simplest amoeba is here today for no other reason than because it could hold its own for billions of years, a feat far surpassing any of the accomplishments of mankind.
I think it is safe to say that any time you hear someone waving around statistics about the improbability of life, you can rest assured that they know absolutely nothing about the matter at all. Their statistics are going to be all but worthless, because they cannot know what they really need to know in order to make such calculations. But even in Foster's case, the wrong statistics were calculated. As I have already noted, Foster only calculates the odds of spontaneous assembly, without regard for the natural propensities of certain molecules to bond with certain others, and most of all, without regard for the fact that natural selection works not through random assembly, but through a methodical process of mutation, reproduction, and selection. The actual equations needed to account even cursorily for natural selection forces are far more complex than Foster imagines. One must estimate a series of binomial probabilities by graphing a normal probability distribution. I will give examples of how this needs to be done below [Also, see Addendum D]. But even this, though it would be more correct than Foster's work, would only be a superficial estimate, devoid of the complex accounting outlined above that would really be necessary to give a true estimate of the probability. And, as I have already suggested, that may be a task far beyond the capabilities of any man or machine.
Foster's first feat of mathematical statistics produces the figure of:
1 in 8.066 x 10^67
This is the chance of getting a specific arrangement of cards by spontaneous ordering. Now, a deck of cards is not a good model for natural selection because it is a strict set of 52 items that can never vary or increase, whereas in nature the number of available materials and the way they can be arranged is all but unlimited, so you could get, say, a deck of 52 aces of spades in nature, but you cannot simulate this in a deck of cards where only 4 different aces exist, and no more.
Consequently, all we can do is show the effect of some simple selection rule on successive shufflings of a deck of cards. On page 39 Foster looks at the odds of a deck shuffling into a complete sequence from high card to low, i.e. from the Ace of Spades down to the Two of Clubs. To simulate 'natural selection,' you need to account in some way for reproduction, mutation, and selection. For example, we could use a shuffling rule such that if the top two cards ever turn up in the right sequence they get to 'live' by reproducing themselves, and these are thus removed from the deck to begin adding up toward the final result. This simulates the link 'reproducing' itself and all other links being 'selected' out of the gene pool and reshuffled (i.e. killed by the hostile environment, to which only certain organizations are suitable). Then, whenever the shuffled deck turns up the next two cards in the right order, granting a greater survival advantage, they attach to the previous two and the sequence grows, and the rest are killed and the deck is shuffled again. The former will represent beneficial mutation, the latter harmful mutation. With only these simple rules, how many shuffles will it take to produce the outcome Foster wants?
The odds of getting exactly two cards in the right order depends upon the Law of Permutations:
1 / (nPr) = 1 / (n! / (n-r)!)
52P2 = 52!/(52-2)!
It follows that the odds are 1 in 2652, or about 0.038%.
The odds of getting another shuffle with the right top two cards would then be:
1 / 50P2 = 1 / (50! / (50-2)!) = 1 / 2450
And so forth. The odds get better as the deck thins out. But let's cheat for Foster, and pretend the odds remain 1 in 2652 with every draw, as if the shuffled part of the deck were to refill itself with an endless supply of useless jokers. Even with these odds stacked against us (this produces a probability even worse than Foster's of reaching Foster's sequence in 26 straight draws: i.e. about 1 in ten to the eighty-ninth power), time is our friend. Because we are relying not on random assembly, but slow and methodical assembly over time, the more times we reshuffle, the more likely we are to reach our goal.
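The step-by-step odds above can be checked with a few lines of Python. This is only a sketch: the function and variable names are illustrative, and the "expected shuffles" figures rest on the assumption that each stage is an independent wait for one success, so that its average length is simply the odds against it.

```python
import math

def step_odds(cards_left: int) -> int:
    """Ordered ways to deal the top two cards from the remaining deck: nP2."""
    return math.perm(cards_left, 2)

# Odds against the right top pair as the deck thins by two cards per success.
odds_by_step = [step_odds(n) for n in range(52, 0, -2)]

# Assumption: each stage is a geometric wait, so its mean length is 1/p = nP2.
expected_shuffles_thinning = sum(odds_by_step)

# The handicapped "endless jokers" version keeps 1 in 2652 for all 26 stages.
expected_shuffles_handicapped = 26 * step_odds(52)

# Odds against 26 successes in 26 straight draws at 1 in 2652 each,
# expressed as a power of ten.
log10_straight_run = 26 * math.log10(step_odds(52))

print(odds_by_step[:2])
print(expected_shuffles_thinning, expected_shuffles_handicapped)
print(round(log10_straight_run, 2))
```

On these assumptions the thinning deck needs far fewer shuffles on average than the handicapped version, which is why holding the odds at 1 in 2652 counts as cheating in Foster's favor.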
In fact, there is more than a 97% chance that we will reach Foster's sequence in only 100,000 shuffles, which means:
In contrast to Foster's prediction of:
This is calculated using sophisticated math perhaps beyond Foster's ken, although the details can be found in Mario Triola's Elementary Statistics (5th ed., 1993), pp. 250-6. I will remind you, I am referring to an introductory statistics textbook! This is not something only experts know. This is a method taught in the very first semester of statistical mathematics. The odds can be precisely determined using the binomial equation, solved for all necessary values, but the laws of normal probability distribution allow us to arrive at a reliable estimate with much less work. If you want to see why we would prefer the easy estimate, I will show you what the binomial equation looks like:
P(x) = n! * p^x * q^(n-x) / ((n-x)! * x!)
Where x is the number of successes needed (in our case at least 26), n is the number of tries (I have arbitrarily chosen 100,000 tries), p is the chance of any one try being a success (as we already have figured, this is 0.00038), and q is the chance of any one try being a failure (i.e. based on the Law of Complementarity, this is 1 minus 0.00038, or 0.99962). To really press this home, the '!' symbol means 'factorial,' or the value of the number given times every whole number below it down to one (e.g. 6! equals 6*5*4*3*2*1, or 720). To calculate the odds, we would have to solve for P(x) for every value of x between 26 and 100,000. Since no one wants to work through such a monstrous equation nearly a hundred thousand times over, much less calculate the factorials of numbers in the tens of thousands, we will go the easy route.
If we charted all the possible results of this equation on a graph, including the results for all values below 26 as well as above, they would form a bell curve, with a mean value in the center equal to n*p, or 100,000*0.00038, which equals 38. This means that the most probable number of successes in 100,000 tries will be 38. We only need 26, but we are dealing with whole numbers, and so 26 is really the whole range from 25.5 to 26.5 (we will not bother with this distinction in the future, since it becomes insignificant when n exceeds 1000). The difference between 25.5 and 38 then gives us what is called a z-score, and that z-score tells us (via any standard z-score table) the probability of getting any value between the two (in actual fact, this equals the area under the graph, where the z-score equals the value along the x-axis). Since 38 is in the middle of the graph, the odds of getting any number of successes from 38 to 100,000 is a flat 50%. We will add that to the odds of getting any number of successes from 25.5 to 38 to arrive at a good estimate of the odds of getting any number of successes of 26 or more. The z-score equals (x-m)/s, where x is the minimum needed result (25.5), m is the average result (38), and s is the 'standard deviation,' which equals the square root of the product of n (100,000), p (0.00038), and q (0.99962). Doing the math, we get a z-score of -2.02. On any z-score table, this reveals a final result of 47.83%. Added to 50%, we get a 97.83% chance of getting at least 26 successes (which is all we need to complete Foster's sequence) after 100,000 shuffles. Indeed, the chances only begin to drop below 50% when you have fewer shuffles than 68,000.
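For readers who would rather let a computer do the work, the estimate can be sketched in Python and compared against the binomial equation itself. The function names are my own, and the exact figure uses a shortcut that is an assumption of this sketch rather than the textbook's method: it sums the 26 terms for 0 through 25 successes and takes the complement, instead of summing nearly a hundred thousand terms.

```python
import math
from statistics import NormalDist

def normal_tail(n: int, p: float, x_min: float) -> float:
    """Normal approximation to P(X >= x_min) for X ~ Binomial(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (x_min - mean) / sd
    return 1 - NormalDist().cdf(z)

def exact_tail(n: int, p: float, k_min: int) -> float:
    """Exact P(X >= k_min): one minus the binomial terms for 0..k_min-1."""
    q = 1 - p
    below = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(k_min))
    return 1 - below

# 100,000 shuffles, p = 0.00038, at least 26 successes (25.5 with the
# whole-number correction described in the text).
approx = normal_tail(100_000, 0.00038, 25.5)
exact = exact_tail(100_000, 0.00038, 26)

print(f"normal estimate: {approx:.4f}")
print(f"exact binomial:  {exact:.4f}")
```

Both figures should land in the neighborhood of 98%, so the easy route costs very little accuracy here.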
How might this translate into biological terms? We can assume that each shuffle represents a mutation, and each successful shuffle a beneficial mutation. To show what this means in terms of a population of organisms, if we assume that only 1 in 5 billion births suffers a mutation, and that each generation consists of only 1 billion births, then there will be only one mutation every 5 generations. But if there is one generation every hour (as there would be in any colony of bacteria), then it will only take 57 years to produce Foster's sequence! That is because there is about a 98% chance that it will be produced after 100,000 mutations, and if there is a mutation every five hours, that works out to 57 years. Now, we have chosen what is perhaps an unfairly high chance of beneficial mutation in this example, because we are limited by the awkward fiction of a deck of cards. But it should already be clear that the sort of math needed to actually get anywhere on this problem is far more complicated than what Foster uses, and the results are quite different. We have also assumed that only one arrangement of cards has survival benefits----in reality, and in the information space of DNA genomes, the number of viable and advantageous sequences of genes is incalculable and could very well be nearly infinite, and the range will differ for every different environment.
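The 57-year figure is plain arithmetic, and a short Python sketch makes the chain of assumptions explicit. The function name and the 365.25-day year are my own choices; the rates are the ones assumed in the paragraph above.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumption: average year length

def years_to_accumulate(mutations_needed: float,
                        births_per_generation: float,
                        mutation_rate_per_birth: float,
                        hours_per_generation: float) -> float:
    """Years needed for a population to log a given number of mutations."""
    mutations_per_generation = births_per_generation * mutation_rate_per_birth
    generations = mutations_needed / mutations_per_generation
    return generations * hours_per_generation / HOURS_PER_YEAR

# 100,000 mutations, 1 billion births per generation,
# 1 mutation per 5 billion births, one generation per hour.
years = years_to_accumulate(100_000, 1e9, 1 / 5e9, 1.0)
print(round(years, 1))
```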
Foster claims, beginning on page 52, that a million monkeys typing 318 random words per minute for a million years could never produce Wordsworth's poem Daffodils, which Foster describes as a sequence of 159 letters (he ignores spaces and punctuation). His calculations are correct as far as determining the odds of this occurring purely by chance. But, as I've already pointed out, evolution does not occur purely by chance----it occurs as the result of natural selection. So let's factor in what Foster has left out: natural selection. As already noted, natural selection is the combined effect of three forces: replication, mutation, and selection. How might this be simulated with the monkeys and their typewriters?
Let us imagine that the monkeys are typing at computers which are simulating a process of natural selection. Each keystroke represents a mutation, and any incorrect keystroke represents an unviable mutation----a failure, a genome that is easily and quickly killed by the environment----whereas any three consecutive correct keystrokes represents a robust survivor (in this arbitrary analogy, the only kind of order that can survive----an unrealistic limitation, but applicable to our abstract case). The computer will automatically erase ('kill off') any incorrect generations, but let live any correct one. How long will it take for the monkeys, aided by this natural selection, to produce Daffodils?
I have chosen three-letter sets, instead of single letters or letter pairs, so as to simulate the reality that most mutations are fatal, and only a scant few beneficial----nevertheless, we see that this does not matter, because only the viable ones reproduce and multiply anyway. That is the beauty of natural selection. This rule produces a rate of viable mutation of only 1/17,576, or 0.006%, much lower than in the card deck example above. Foster does a more accurate job by accounting for the variable probability of the letters in the poem. I am assuming the same odds for every letter in the poem, which is not as exact, but it is a more than reasonable approximation, and this is necessary for what we have to do. Foster uses the figure of one million monkeys typing for one million years, for a grand sum of:
1.67 x 10^20 keystrokes
We'll make it even harder on ourselves. With that amount of work, if the conditions are assumed to be correct (i.e. if the computer is in fact selecting for Daffodils----i.e. if that and all its simpler ancestors are the only 'genomes' that can survive in our imaginary 'environment'), then success is actually guaranteed. So we'll stack the odds even more against us. What are the odds of one lone monkey, aided by natural selection, producing Daffodils after only 3 million keystrokes, or just one week of random typing according to Foster's generous assumptions? It will be easier to work in three-keystroke units, and with that in mind, using the same normal probability distribution as in the card deck example, we get the following values:
Now, if x = 53 (the number of correct triple-strokes needed to complete Daffodils), then z = (x-m)/s = -0.90, producing a percentage of 0.5 + 0.3159, or 81.59%. So, while Foster wants us to think that it will take a million monkeys an untold trillions of years to produce such a result, in fact, if we actually account for natural selection, it could take as little as one week for a single monkey! And this was assuming an even lower rate of beneficial mutation than in the card example above.
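The z-score can be reproduced with a short Python sketch. The inputs are assumptions on my part, reconstructed from the text: one million triple-strokes (3 million keystrokes), and a success rate rounded to 0.00006 (the 0.006% quoted above), which is the value that yields a z-score of -0.90. The unrounded 1/17,576 is run alongside it for comparison.

```python
import math
from statistics import NormalDist

def chance_of_at_least(successes_needed: float, tries: float, p: float):
    """Return (z, probability) for at least that many successes, using the
    normal approximation without the whole-number correction."""
    mean = tries * p
    sd = math.sqrt(tries * p * (1 - p))
    z = (successes_needed - mean) / sd
    return z, 1 - NormalDist().cdf(z)

TRIPLES_NEEDED = 159 // 3   # 53 correct triple-strokes complete the poem
TRIES = 3_000_000 // 3      # one week of typing, in triple-strokes

z_rounded, prob_rounded = chance_of_at_least(TRIPLES_NEEDED, TRIES, 0.00006)
z_exact, prob_exact = chance_of_at_least(TRIPLES_NEEDED, TRIES, 1 / 26**3)

print(f"rounded p: z = {z_rounded:.2f}, chance = {prob_rounded:.1%}")
print(f"exact p:   z = {z_exact:.2f}, chance = {prob_exact:.1%}")
```

The result is sensitive to that rounding: with the unrounded rate the same formula gives a z-score nearer -0.5 and a chance of roughly 70%, which is lower but still nowhere near Foster's astronomical odds.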
We might go further and solve another monkey problem dealt with by Foster, inherited from (supposedly) Huxley (see the note at the end of this section): the claim, paraphrased on pages 54 and 55, that six monkeys randomly typing for 'millions and millions of years' would produce all the books in the British Museum. Now, this claim is, as stated, false, and Foster rightly demolishes the assumption. What Huxley apparently forgot to consider was that such a result is not likely to happen by chance, but it is likely given the operation of natural selection. How likely? Foster's assumptions are these:
From this we can guess that the total number of correct keystrokes needed is about:
4.9 x 10^11, or 1.63 x 10^11 triple-strokes
We will use exactly the same 'natural selection' simulation as above, only now with 6 monkeys typing for only nine million years, in all producing about:
8.55 x 10^15 keystrokes, or 2.85 x 10^15 triple-keystrokes
Now, if x = 1.63 x 10^11 (the number of correct triple-strokes needed), then z = (x-m)/s = -19,346.62. A z-score of just -3.5 produces .4999, for a total probability of 99.99%. A z-score in the negative hundreds or thousands indicates a probability so near 100% that the odds are astronomical for the event not to happen! If we give the monkeys only a little over eight million years, the z-score rises to zero and beyond, i.e. the odds begin to drop below 50%; and if we allow only seven million years, the odds become effectively zero. What does this mean? While randomly typing monkeys would be unlikely, as Foster says, to type even a single line from all the books in the British Museum, even given the life of the universe to type away, nevertheless those same six randomly-typing monkeys, when aided by natural selection, would be guaranteed to do it in only nine million years! In fact, we can be even more accurate than that: they would be virtually guaranteed to succeed some time between seven and nine million years, and probably neither sooner nor later.
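The same check can be run on the British Museum figures, along with the break-even point where the average number of successes just equals the number needed. This is a sketch under stated assumptions: the success rate is again the rounded 0.00006, the typing rate is inferred from the nine-million-year total, and all names are illustrative.

```python
import math

P_TRIPLE = 0.00006          # assumption: rounded chance of a correct triple
TRIPLES_NEEDED = 1.63e11    # correct triple-strokes required
TRIPLES_IN_9MY = 2.85e15    # triple-strokes typed by six monkeys in 9 million years

def z_score(needed: float, tries: float, p: float) -> float:
    """z = (x - m) / s for the normal approximation to the binomial."""
    return (needed - tries * p) / math.sqrt(tries * p * (1 - p))

# Typing rate implied by the nine-million-year total.
triples_per_year = TRIPLES_IN_9MY / 9e6

def z_after(years: float) -> float:
    """z-score if the monkeys are given this many years instead."""
    return z_score(TRIPLES_NEEDED, triples_per_year * years, P_TRIPLE)

# Years at which the mean number of successes equals the number needed (z = 0).
break_even_years = TRIPLES_NEEDED / P_TRIPLE / triples_per_year

print(f"z at 9 million years: {z_after(9e6):,.1f}")
print(f"z at 7 million years: {z_after(7e6):,.1f}")
print(f"break-even: {break_even_years / 1e6:.2f} million years")
```

A hugely negative z-score means near certainty of success and a hugely positive one near certainty of failure, which is why the window between the two is so narrow.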
Since we have already shown Foster's math to be totally irrelevant to natural selection, we hardly need to demonstrate the same point with his linchpin proof concerning hemoglobin and the T4 genome. Even though he praises this as the "critical point" that 'refutes' Darwinism (p. 172), he has in fact proved nothing at all about Darwinism. Not only does Darwinism specifically entail that these two molecules were not created by random chance----so that calculating the random chance of their creation is moot----but Foster has not even employed the statistical tools necessary to account for any natural selection forces in his equations. Nowhere does he account for reproduction, which allows a slow building-up from replicating bases instead of relying on a new reshuffling every time. Nor mutation and selection, which allow for change and improvement guided by environmental pressures.
But just to refute Foster's work completely, we should make some attempt to estimate the odds of these two molecules being produced in nature. As I have already said, such an estimate is always going to be hopelessly flawed by the fact that all the factors that must be accounted for cannot even be known, much less included in our equations. Above all, the tools I used on the deck of cards and the typing monkeys assumed that there was only one ideal end-product, and that all beneficial mutations would lead to that single end. In nature, there is no such 'ideal end.' There may be infinitely many ends, and those will also change as environments change, and any given advance does not tend toward any future final product, but is an independently viable genome with a near infinite array of possible advances still before it. Likewise, the above calculations assumed that all the mutations were additive, and incremental. We did not account for other kinds of mutations, such as those in which two independently functioning molecules accidentally combine to produce an entirely new function, or those in which a change occurs in the older bases. We also ignored the fact that mutations are not always in just a single gene or codon. Entire chromosomes or segments can be accidentally replicated, which can then create a new species with double the size of its previous genome. Such doubling already happens quite frequently in plants.
However, since we want to know whether a certain molecule at least can evolve through natural selection, we can employ the somewhat-valid fiction that the molecule in question is the one sole end of the actual selection processes that have occurred in its past, and estimate the frequency of only additive mutations. I, personally, know nothing about the science and biochemical history of hemoglobin (like most adaptations, it could have been born from another molecule which served some other function in earlier creatures), nor do I know anything about the DNA needed to produce it now. Neither, I imagine, does Foster. I also cannot even guess what the original ancestor was of the replicating structure of DNA needed to produce hemoglobin (or whatever protohemoglobin molecule preceded and eventually became hemoglobin). But even Foster failed to account for any of this. He treated the molecule as if it spontaneously created itself. I will repeat his own flawed assumption, while changing only one thing: I will include selection forces in my statistical calculation.
While Foster attempts to account for the varying probabilities of each amino-acid (though he made no attempt to explore any possible differences in the chaining-tendency among the various amino-acids), I will simplify this to the basic 20 types, each with an equal probability of appearing by mutation of 1 in 20, and I will follow his emendation that the number of relevant positions is 516 (cf. p. 83, but see also Addendum F). To this I will arbitrarily add some plausible, and conservative, terms of reproduction, mutation, and selection as follows: hemoglobin could have evolved in a population maintained at one million individuals, who produced a new generation every twenty years, with a rate of additive mutation in the hemoglobin molecule of, let's say, once per five billion births. These arbitrary guesses allow for one relevant mutation every 100,000 years, causing 10,000 mutations in a billion years. Given these assumptions, what are the odds of hemoglobin evolving within the span of a billion years?
Now, if x = 516 (the total number of correct mutations needed), then z = (x-m)/s = 0.73. With a positive z-score, the odds become less than 50%. In effect, the odds equal the complement of what we normally calculate. So, a z-score of 0.73 produces 0.2673, plus 0.5 for a sum of 0.7673, the complement of which is 1-0.7673, or 0.2327. In other words, there is a 23.27% chance of this sort of thing happening. Those are not bad odds. Hemoglobin is not so unnatural a molecule as Foster wants us to believe.
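Here is a minimal Python sketch of the hemoglobin estimate, starting from the arbitrary population figures given above. The variable names are mine, and the sketch computes the normal curve directly rather than reading a rounded z-score off a table, so its last digits differ slightly from the 23.27% quoted.

```python
import math
from statistics import NormalDist

# Arbitrary guesses from the text.
population = 1_000_000        # individuals per generation
years_per_generation = 20
births_per_mutation = 5e9     # one additive mutation per five billion births
span_years = 1e9

# One relevant mutation every so many years, and the total within the span.
years_per_mutation = births_per_mutation / population * years_per_generation
tries = span_years / years_per_mutation

p = 1 / 20                    # chance a mutation yields the right amino-acid
needed = 516                  # relevant positions

mean = tries * p
sd = math.sqrt(tries * p * (1 - p))
z = (needed - mean) / sd
chance = 1 - NormalDist().cdf(z)

print(f"{years_per_mutation:,.0f} years per mutation, {tries:,.0f} mutations")
print(f"z = {z:.2f}, chance = {chance:.1%}")
```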
Foster begins his hemoglobin calculations on page 79, and the T4 calculations on page 82. Compare them with mine. Note that his calculated specificities have no bearing at all on the probability of something evolving by natural selection. Rather, the most important factor is the very thing Foster totally ignores: the rate of mutation----in particular, the rate of additive mutation. Ultimately, the complexity of the final result is almost entirely irrelevant to the chance it will be produced naturally. Can we marshal similar estimates for his T4 bacteriophage genome? The only difference is that the T4 is in fact a complete replicating organism. Foster says there are 61,538 codon triplets in the T4 genome, each with 20 varieties of equal probability. It probably did not begin with one codon, but we cannot even guess at what its real ancestor looked like genetically, much less the odds of the genesis of that first ancestor, so I will do the math as if it began with one codon. Likewise, once again I will ignore diversions from the path from proto-T4 to the full T4 genome, as well as mutations within the ancestral bases, and so on.
However, we do know that organisms like the T4 produce far larger populations, and generations span far smaller periods of time, than any organisms which contain hemoglobin. It would even be outrageously conservative to say that the proto-T4, as well as the T4, has had an average population census of one trillion at any given time, and that a completely new generation is produced on average at least once a day (see Addendum G). We can even handicap ourselves with a rate of additive mutation of once per 200 quadrillion reproductions. Even then we will still see one such mutation per two hundred thousand generations, for a rate of one relevant mutation every 548 years. Of these, as we have already noted, bad mutations will vanish with no effect, but useful ones will persist, and will rapidly acquire the same population statistics. Indeed, a single T4 mutant, with the given estimates, will reach the trillion population mark in less than two days----this is perhaps why evolution sometimes appears punctuated, for it can take centuries or even millennia for an advantage to be gained through new mutations, but once gained it can be exploited even to the point of total dominance in a matter of days or years in single-celled populations, a mere instant of geological time.
At any rate, given the above guesses, what are the odds of the T4 genome arising through natural selection within one billion years?
Now, if x = 61,538 (the total number of correct mutations needed), then z = (x-m)/s = -100.9. In other words, the evolution of the T4 bacteriophage is essentially guaranteed to occur in less than a billion years, provided all the conditions are right, and the rates of reproduction and mutation are as estimated above. However, if the math is done, and our estimations are correct, it can be shown that the evolution of an organism like the T4 must take more than 500 million years, since the odds within that period of time are close enough to zero for the event to be regarded as virtually impossible. In fact, the odds start to drop below 50% when the time falls under 675 million years. Of course, all of that changes if we become more realistic, and credit the T4 ancestors with populations in the thousands of trillions, generations by the hour or even the minute, and relevant mutations at a rate of one every several trillions of replications.
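The T4 figures can be checked the same way. This sketch assumes the guesses stated above (a trillion individuals, one generation per day, one additive mutation per 200 quadrillion reproductions) and rounds the mutation interval to whole years, as the text does. The names and the 365.25-day year are my own.

```python
import math

population = 1e12                 # individuals per generation
reproductions_per_mutation = 2e17 # 200 quadrillion
days_per_generation = 1
span_years = 1e9

generations_per_mutation = reproductions_per_mutation / population
years_per_mutation = round(generations_per_mutation * days_per_generation / 365.25)

p = 1 / 20        # 20 equally likely varieties per codon triplet
needed = 61_538   # codon triplets in the T4 genome, per Foster

tries = span_years / years_per_mutation
mean = tries * p
sd = math.sqrt(tries * p * (1 - p))
z = (needed - mean) / sd

# Span at which the mean number of successes just equals the number needed.
break_even_years = needed / p * years_per_mutation

print(f"{generations_per_mutation:,.0f} generations per mutation")
print(f"one mutation every {years_per_mutation} years")
print(f"z = {z:.1f}")
print(f"break-even: {break_even_years / 1e6:.0f} million years")
```

By this arithmetic the interval between relevant mutations comes to 200,000 generations, or about 548 years, and the break-even point sits just under 675 million years.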
Note from Above:
 I use the phrase "supposedly" because Foster does not make this claim directly, but in a quote by James Jeans from his 1930 book The Mysterious Universe. Jeans says only that he "thinks" Huxley said it, and does not say where or when he said it if he did. Foster proceeds to imply a connection between this remark and a famous debate in 1860 between Huxley and Wilberforce, and I have heard this connection repeated by other people. But if Huxley said it, it could not have been in that 1860 debate, for, as a clever reader pointed out to me, typewriters were not invented until 1873 (see Typewriters, Qwerty & Typing).