Posts Tagged ‘Book Review’

Darwinian Evolution is a form of Computational Learning

The punchline of this book is perhaps: “Changing or increasing functionality of circuits in biological evolution is a form of computational learning”. Although the book also speaks of topics other than evolution, its underlying framework is the Probably Approximately Correct model [1] from the theory of Machine Learning, from which it gets its name.


“Probably Approximately Correct” by Leslie Valiant


I first heard of this explicit connection between Machine Learning and Evolution in 2010 and have been quite fascinated by it since. It must be noted, however, that similar ideas have appeared in the past, though it would not be incorrect to say that they have usually taken the form of metaphor. It is another matter that this metaphor has generally been avoided, for reasons I outline towards the end of this review. When I first heard the idea it immediately made sense and, like all great ideas, looks completely obvious in hindsight. Ergo, I was quite excited to see this book and preordered it months in advance.

This book is basically a popular science version and expansion of the ideas that appeared in a Journal of the ACM article titled “Evolvability” in 2007 [2]. I have to say that I was expecting a bit more from the book than what it had, and in this sense I was somewhat disappointed. But at the same time, it can perhaps be a useful ‘starting’ book for those interested in the broad idea but without much background. We all have examples of books like this. Here’s one from me: a couple of years ago I read a book on knots by Alexei Sossinsky. The book got bad reviews from many serious mathematicians for not really being about mathematics, besides having many mistakes. But for a complete beginner, I thought it was an absolutely wonderful book, introducing the reader to the basic ideas very informally while sharing the author’s infectious enthusiasm for problems in Knot Theory. Coming back: in short, if you have enough background in the basic notions of PAC Learning then the book might feel quite redundant in some chapters, but depending on how you read, it might still turn out to be a good read anyway. Judging from some other (rude) reviews: if you are a professional nitpicker then this book is probably not worth your time, since it deals with ideas and not with a formal treatment of them. Given what it is aiming at, I think it is a good book.

Before attempting to sketch a skiagram of the main content of the book: one of its main subthemes, constantly emphasized, is to look at computer science as a kind of enabling tool for studying natural science. This is oft ignored, perhaps because CS curricula are rarely designed with any natural science component in them, and hence there is no impetus for aspiring computer scientists to view the natural sciences through the computational lens. On the other hand, the relation of computer science with mathematics has become quite well established. As a technology, the impact of Computer Science has been tremendous. All this is quite remarkable given that just about a century ago the notion of a computation was not even well defined. Unrelated to the book: more recently people have started taking the idea of digital physics (i.e. physics from a solely computable/digital perspective) seriously. But in the other natural sciences its usage is still woeful. Valiant draws upon the example of Alan Turing as a natural scientist, and not just a computer scientist, to make this point. Alan Turing was more interested in natural phenomena (intelligence, limits of mechanical calculation, pattern formation etc.) and used tools from Computer Science to study them, a fact that is evident from his publications. That Turing was trying to understand natural phenomena was remarked upon in his obituary by Max Newman, who summarized the body of his work as: “the extent and the limitations of mechanistic explanations of nature”.


The book begins with a delightful and quite famous quote by John von Neumann (it was through this passage that I also discovered the context of the quote). The passage also summarizes the motivation for the book very well:

“In 1947 John von Neumann, the famously gifted mathematician, was keynote speaker at the first annual meeting of the Association for Computing Machinery. In his address he said that future computers would get along with just a dozen instruction types, a number known to be adequate for expressing all of mathematics. He went on to say that one need not be surprised at this small number, since 1000 words were known to be adequate for most situations in real life, and mathematics was only a small part of life, and a very simple part at that. The audience responded with hilarity. This provoked von Neumann to respond: “If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is”

Though counterintuitive, von Neumann’s quip contains an obvious truth. Einstein’s theory of general relativity is simple in the sense that one can write the essential content on one line as a single equation. Understanding its meaning, derivation, and consequences requires more extensive study and effort. However, this formal simplicity is striking and powerful. The power comes from the implied generality, that knowledge of one equation alone will allow one to make accurate predictions about a host of situations not even connected when the equation was first written down.

Most aspects of life are not so simple. If you want to succeed in a job interview, or in making an investment, or in choosing a life partner, you can be quite sure that there is no equation that will guarantee you success. In these endeavors it will not be possible to limit the pieces of knowledge that might be relevant to any one definable source. And even if you had all the relevant knowledge, there may be no surefire way of combining it to yield the best decision.

This book is predicated on taking this distinction seriously […]”

In a way, aspects of life such as those mentioned above appear theoryless, in the sense that there seems to be no mathematical or scientific theory like relativity for them: something quantitative, definitive and short. Note that these are generally not “theoryless” in the sense that no theory exists at all, since people obviously do all the tasks mentioned quite effectively in a complex world. A specific example is how organisms adapt and species evolve without having a theory of the environment as such. How such coping mechanisms can come into being in the first place is the main question asked in the book.

Let’s stick to the specific example of biological evolution. Clearly, it is one of the central ideas in biology and perhaps one of the most important theories in science, one that changed the way we look at the world. But in spite of its importance, Valiant claims (correctly, in my opinion) that evolution is not understood well in a quantitative sense. The evidence that convinces us of its correctness is of the following sort: biologists usually show a sequence of evolving objects, or stages, where each later one is more complex than the previous. Since this is studied mostly via the fossil record, there is always a lookout for missing links between successive stages. (As a side note: Animal Eyes is an extremely fascinating and very readable book that deals with this question, specifically for the evolution of the eye. This is particularly interesting given that, due to the very nature of the eye, there can be no fossil record of it.) Darwin had remarked that numerous successive paths are necessary: that is, if it was not possible to trace a path from an earlier form to a more complicated form, then it was hard to explain how the latter came about. But again, as Valiant stresses, this is not really an explanation of evolution. It is more of an “existence proof” than a quantitative explanation. That is, even if there is evidence for the existence of a path, one can’t really say that a certain path is likely to be followed just because it exists. As another side note: watch this video on evolution by Carl Sagan from the beautiful COSMOS.

Related to this, one of the first questions that one might ask, and that was indeed asked by Darwin himself, is: why has evolution taken as much time as it has? How much time would have sufficed for all the complexity that we see around us to evolve? This question in fact bothered Darwin a lot in his day. In On the Origin of Species he originally gave an estimate of the Earth’s age of at least 300 million years, implying indirectly that there was enough time. This estimate was immediately attacked by Lord Kelvin, one of the pre-eminent British physicists of the time, who estimated the age of the Earth to be only about 24 million years. This caused Darwin to withdraw the estimate from his book. However, Kelvin’s estimate greatly worried Darwin, as he thought 24 million years just wasn’t enough time. To motivate the same theme Valiant writes the following:

“[…] William Paley, in a highly influential book, Natural Theology (1802) , argued that life, as complex as it is, could not have come into being without the help of a designer. Numerous lines of evidence have become available in the two centuries since, through genetics and the fossil record, that persuade professional biologists that existing life forms on Earth are indeed related and have indeed evolved. This evidence contradicts Paley’s conclusion, but it does not directly address his argument. A convincing direct counterargument to Paley’s would need a specific evolution mechanism to be demonstrated capable of giving rise to the quantity and quality of the complexity now found in biology, within the time and resources believed to have been available. […]”

A specific, albeit more limited, version of this question might be: consider the human genome, which has about 3 billion base pairs. Now, if evolution is a search problem, as it naturally appears to be, then why did the evolution of this genome not take exponential time? If it had taken exponential time, then clearly such evolution could not have happened on any time scale since the origin of the Earth. Thus, a more pointed question to ask would be: what circuits could evolve in sub-exponential time (and, on a related note, what circuits are evolvable only in exponential time)? Given the context, the idea of thinking about this in terms of circuits might seem a little foreign. But on some more thought it is quite clear that the idea of a circuit is natural when it comes to modeling such systems (at least in principle). For example: one might think of the genome as a circuit, just as one might think of the complex protein interaction networks and the networks of neurons in the brain as circuits that update themselves in response to the environment.
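To get a feel for why blind exhaustive search is hopeless here, a back-of-the-envelope calculation helps. The sketch below only uses the 3 billion base pair figure from the text; the "trial budget" (10^40 organisms alive at once, one generation per second, for 4.5 billion years) is a deliberately over-generous illustrative assumption and not a figure from the book.

```python
import math

GENOME_LENGTH = 3_000_000_000  # base pairs, the figure quoted in the text
ALPHABET_SIZE = 4              # A, C, G, T

# log10 of the number of possible genomes of this length, i.e. of 4 ** (3 * 10**9)
log10_search_space = GENOME_LENGTH * math.log10(ALPHABET_SIZE)

# Hypothetical, deliberately generous upper bound on the number of "trials":
# 10**40 organisms alive at once, one generation per second, for 4.5 billion years.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
log10_trials = 40 + math.log10(4.5e9 * SECONDS_PER_YEAR)

print(f"log10(number of possible genomes) ~ {log10_search_space:.3e}")
print(f"log10(generous trial budget)      ~ {log10_trials:.1f}")
```

The first number is itself in the billions (the search space has nearly two billion decimal digits), while the second is under 60. Whatever evolution is doing, it cannot be enumerating genomes.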

The last line is essentially the key idea of adaptation: entities interact with the environment and update themselves (hopefully to cope better) in response. But the catch is that the world/environment is far too complex for (relatively speaking) simple entities with limited computational power to have a theory for. Hence, the entity will somehow have to cope and improve its functionality without really “understanding” the environment, which it can only partially model. The key thing to pay attention to here is the interaction, or the interface, between the limited computational entity in question and the environment. The central idea in Valiant’s thesis is to think of and model this interaction as a form of computational learning. The entity absorbs information about the world and “learns” so that in the future it “can do better”. A lot of Biology can be thought of in this way: complex behaviours in environments. By complex behaviours we mean that in different circumstances, we (or alternately our limited computational entity) might behave completely differently, and that there can’t possibly be a look-up table for modeling such interactions. Such interactions and responses can just be thought of as complicated functions, e.g. how a deer reacts could be seen as a function of some sensory inputs. Or, for another example: protein circuits. The human body has about 25,000 proteins. How much of a certain protein is expressed is a function of the quantities of other proteins in the body. This function is quite complicated, certainly not something like a simple linear regression. Thus there are two main questions to be asked. One, how did these circuits come into being without there being a designer to actually design them? Two, how do such circuits maintain themselves? That is, each “node” in a protein circuit is a function, and as circumstances change it might be best to change the way it works. How could such a thing have evolved?

Given the above, one might ask another question: at what rate can functionality increase or adapt under the Darwinian process? Valiant (like many others, such as Chaitin; see this older blog post for a short discussion) comes to the conclusion that Darwinian evolution is a very elegant computational process. And since it is so, there has to be a quantitative way of saying what rate of environmental change can be kept up with, and what environments are unevolvable for the entity. It is not hard to see that this is essentially a question in computer science, and arguably no other discipline has the tools to deal with it.

Insofar as (biological) interactions and responses might be thought of as complicated functions, and the limited computational entity that is trying to cope has to do better in the future, this is just machine learning! This idea, that changing or increasing functionality of circuits in biological evolution is a form of computational learning, is perhaps very obvious in hindsight. Changing functionality is done in Machine Learning in the following sense: we want to acquire complicated functions without explicitly programming them, from examples and labels (or “correct” answers). This addresses exactly the question of how complex mechanisms can evolve without someone designing them (consider a simple perceptron learning algorithm as a toy example to illustrate this). In short: we generate a hypothesis, and if it doesn’t match our expectations (in performance) then we update the hypothesis by a computational procedure. Even a single example might be enough to change the hypothesis. One could draw an analogy to evolution where the “examples” are experiences and the genome is the circuit that is modified over a period of time. But note that this is not what evolution looks like, because the above (especially the perceptron example) sounds more Lamarckian. What the Darwinian process says is that we don’t change the genome directly based on experiences. What happens instead is that many copies of the genome are made, which are then tested in the environment, with the better ones having a higher fitness. Supervised Machine Learning as described above is very Lamarckian and not exactly Darwinian.
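The perceptron mentioned above makes the “Lamarckian” flavour concrete: the hypothesis is edited directly, right after a single bad experience. The following is a minimal sketch of the textbook mistake-driven perceptron; the toy data (the OR of two bits, with labels in {-1, +1}) is my own choice for illustration and does not come from the book.

```python
def predict(weights, bias, x):
    """Return +1 or -1 for input x under the current hypothesis."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else -1


def perceptron_train(examples, n_features, max_epochs=100):
    """Mistake-driven perceptron: a single misclassified example changes the hypothesis."""
    weights = [0] * n_features
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in examples:
            if predict(weights, bias, x) != label:
                # Direct, per-example edit of the hypothesis (the "Lamarckian" step).
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
                mistakes += 1
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


# Toy target: OR of two bits, labels in {-1, +1}.
EXAMPLES = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = perceptron_train(EXAMPLES, n_features=2)
print(weights, bias)
```

Note that the update rule looks at one labelled example at a time. That is exactly the kind of access to the target that the Darwinian setting does not grant.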

Thus, there is something unsatisfying in the analogy to supervised learning, where there is a clear notion of a target. One might then ask, what is the target of evolution? Evolution is thought of as an undirected process, without a goal. This is true in one very important sense, but in another it is incorrect. Evolution does not have a goal in the sense that it wants to evolve humans or elephants etc. But it certainly does have a target. This target is “good behaviour” (where behaviour is used very loosely) that would aid the survival of the entity in an environment. Thus, this target is the “ideal” function (which might be quite complicated) that tells us how to behave in each situation. This is already incorporated in the study of evolution by notions such as fitness, which encode that some functions are more beneficial than others. Thus evolution can be framed as a kind of machine learning problem based on the notion of Darwinian feedback: make many copies of the “current genome” and let the one with good performance in the real world win. More specifically, this is a limited kind of PAC Learning. If you call your current hypothesis your genome, then your genome does not depend on your individual experiences. Variants of this genome are generated by a polynomial time randomized Turing Machine. To illuminate the difference with supervised learning, we come back to a point made earlier, that PAC Learning is essentially Lamarckian: we have a hidden function that we want to learn, and we have examples and labels corresponding to this hidden function, which could be considered “queries” to it. We then process these example/label pairs and learn a “good estimate” of the hidden function in polynomial time. It is slightly different in Evolvability. Again, we have a hidden “ideal” function f(x), and as before the hypothesis is the genome. However, what we can find out about the target is very limited, since one can empirically only observe the aggregate goodness of the genome (a performance function). The task then is to mutate the genome so that its functionality improves. So the main difference with the usual supervised learning is that one can query the hidden function only in a very limited way: we can’t act on a single example and have to take aggregate statistics into account.
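To make “Darwinian feedback” concrete, here is a heavily simplified sketch. It is not Valiant’s actual algorithm: his model estimates performance from samples, uses a tolerance, and picks randomly among beneficial mutations, whereas this toy computes performance exactly over all inputs and greedily keeps the best variant. The hidden target (a monotone disjunction over 6 bits), the mutation set (add or drop one variable) and all names are my own assumptions for illustration. The one faithful feature is the important one: the procedure never looks at an individual labelled example, only at the aggregate performance of each candidate genome.

```python
from itertools import product

N = 6
TARGET = frozenset({0, 2, 5})              # hypothetical "ideal" function: x0 OR x2 OR x5
INPUTS = list(product([0, 1], repeat=N))   # uniform distribution over all 6-bit inputs


def disjunction(variables, x):
    """Value in {-1, +1} of the monotone disjunction of the chosen variables."""
    return 1 if any(x[i] for i in variables) else -1


def performance(genome):
    """Aggregate agreement with the target: the average of f(x) * r(x) over all inputs."""
    total = sum(disjunction(genome, x) * disjunction(TARGET, x) for x in INPUTS)
    return total / len(INPUTS)


def variants(genome):
    """All genomes one mutation away: add or drop a single variable."""
    return [genome ^ frozenset({i}) for i in range(N)]


def evolve(genome=frozenset(), max_generations=50):
    """Keep the best-performing variant each generation; stop when none is better."""
    history = [performance(genome)]
    for _ in range(max_generations):
        best = max(variants(genome), key=performance)
        if performance(best) <= performance(genome):
            break
        genome = best
        history.append(performance(genome))
    return genome, history


genome, history = evolve()
print(sorted(genome), history)
```

Starting from the empty genome, each generation's winner adds one relevant variable, and the performance climbs until the genome coincides with the target.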

Then one might ask: what can one prove using this model? Valiant demonstrates some examples. For instance, parity functions are not evolvable under the uniform distribution, while monotone disjunctions are actually evolvable. These functions are of course very non-biological, but they do illustrate the main idea: that the ideal function together with the real world distribution imposes a fitness landscape, and that in some settings we can show that some functions can evolve and some cannot. This in turn illustrates that evolution is a computational process independent of substrate.
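A small experiment gives some intuition for why parities are hard. Under the uniform distribution, any two distinct parity functions agree on exactly half of the inputs, so every wrong parity has the same aggregate performance (zero) against the target and the fitness landscape gives no direction to climb. The sketch below checks this by brute force on 4 bits. It illustrates the intuition only and is not the proof of non-evolvability; the target and the names are my own assumptions.

```python
from itertools import combinations, product

N = 4
INPUTS = list(product([0, 1], repeat=N))   # uniform distribution over all 4-bit inputs
TARGET = (0, 2)                            # hypothetical target: x0 XOR x2


def parity(variables, x):
    """Value in {-1, +1} of the parity of the selected bits."""
    return -1 if sum(x[i] for i in variables) % 2 else 1


def performance(genome, target=TARGET):
    """Average of f(x) * r(x) over all inputs."""
    return sum(parity(genome, x) * parity(target, x) for x in INPUTS) / len(INPUTS)


# Every parity function on N bits, identified by its (sorted) tuple of variables.
ALL_PARITIES = [s for k in range(N + 1) for s in combinations(range(N), k)]
landscape = {s: performance(s) for s in ALL_PARITIES}

for subset, value in landscape.items():
    print(subset, value)
```

Every entry except the target itself comes out as 0.0: being “close” to the right parity (say, off by one variable) is no better than being completely wrong, so selection on aggregate performance has nothing to work with.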

In the same sense as above, it can be shown that Boolean conjunctions are not evolvable for all distributions. There has also been similar work of late on real valued functions, which is not reported in detail in the book. Another recent line of work that is only mentioned in passing towards the end is the study of multidimensional spaces of actions, dealing with sets of functions (not just one function) that might be evolvable together. This is an interesting line of work, as it is pretty clear that biological evolution deals with the evolution of a set of functions together and not functions in isolation.


Overall I think the book is quite good, although I would rate it 3/5. Let me explain. Clearly this book is aimed at the non-expert. But this might be disappointing to those who bought the book because this recent area of work, studying evolution through the lens of computational learning, is very exciting and intellectually interesting. The book is also aimed at biologists, and considering this, the learning sections of the book are quite dumbed down. But at the same time, I think the book might fail to impress most of them anyway. I think this is because biologists (barring a small subset) generally have a very different way of thinking, say as compared to mathematicians or computer scientists, especially when it comes to the computational lens. Over the past couple of years I have had some arguments about the main ideas in the book with biologist friends who take the usage of “learning” to imply that evolution is a directed process. It would have been great if the book had spent more time on this particular aspect. Also, the book explicitly states that it is about quantitative aspects of evolution and has nothing to do with speciation, population dynamics and other rich areas of study. However, I have already seen some criticism of the book by biologists on this premise.

As far as I am concerned, as an admirer of the range and quality of Prof. Leslie Valiant’s contributions, I would have preferred it if the book had gone into more depth, just to have a semi-formal monograph on the study of evolution using the tools of PAC Learning right from the person who initiated this area of study. However, this is just a selfish consideration.



[1] A Theory of the Learnable, Leslie Valiant, Communications of the ACM, 1984 (PDF).

[2] Evolvability, Leslie Valiant, Journal of the ACM, 2007 (PDF).



I wrote rather passionately about J. Robert Oppenheimer in one part of a previous post marking an anniversary of the first nuclear bombings. Those interested in Oppenheimer’s life might be interested in reading that part.

I read American Prometheus: The Triumph and Tragedy of J. Robert Oppenheimer (Buy it) by Kai Bird and Martin Sherwin between 14 and 29 October, and it is amongst the best biographies I have read. It is a very different book: it is not inspiring in the way some biographies of great men are, and it is not philosophical or based on some mad pursuit. I have never seen such a well researched biography before. Sometimes I found the details about Oppenheimer’s personal life nauseating, but that just indicates the amount of surveillance he was under.

This book is not inspiring, it is haunting*.


[American Prometheus- Kai Bird, Martin Sherwin. Image Source ]

A Pulitzer Prize winner (2006), this book chronicles the life of the physicist, administrator, poet, American patriot and father of the atomic bomb from his very early days to his last. The authors delve into every aspect of Oppenheimer’s life: his birth, his time in England and Germany as a student, his leftist days at Berkeley, his love interests, his role in the Manhattan Project, his public humiliation in the security hearings of 1954 at the height of the red scare, and his quiet life after that till his death. Twenty five years of research by the authors culminated in a devastatingly sad biography of one of the most famous men of all time.

The prologue of the book quotes George Kennan (1904-2005), the famous American diplomat now known as the father of the containment policy, as saying the following about him:

“In the dark days of the early fifties, when troubles crowded in upon him from many sides and when he found himself harassed by his position at the center of controversy, I drew his attention to the fact that he would be welcomed in a hundred academic centers abroad and asked him whether he had not thought of taking academic residence outside this country. His answer, given to me with tears in his eyes: ‘Damn it, I happen to love this country’”

The book is divided into 5 parts and 40 chapters, each part covering a stage in his life. I would urge everyone who gets the opportunity to read this book, even those who have a personal enmity with science and scientists but read about general things once in a while.

Some sections of the book are striking, represent major turning points in his life, and simply stand out.

While reading I observed that the authors lay out very clearly the facts about the “Apple Incident”, in which I have been interested ever since I first read about it. In a state of depression and enormous emotional distress during his troubled student days at Cambridge, England, in the autumn of 1925, Oppenheimer placed a “poisoned” apple on the desk of his head tutor, Patrick Blackett (few people know that Blackett was an adviser to Jawaharlal Nehru). The authorities learned of the incident, and after some talking with his parents he was allowed to stay at Cambridge on the condition that he visit a psychiatrist on a mandated schedule. Oppenheimer gradually recovered and thrived in his golden intellectual days at Göttingen, Germany, shortly afterwards, where he produced some fundamental research and became known as one of the best quantum physicists of the time. The period from Göttingen to his days as the Director of the Manhattan Project was the best of his life. By the end of 1945 Oppenheimer was one of the most famous physicists of the time.


[The above photo of Oppenheimer’s porkpie hat was the cover of Physics Today in May 1948, exemplifying the high regard and respect Oppenheimer commanded at that time. Image Source]

Taking a brief digression, I’d cite two of my favorite quotes from the book:

After the emotional turmoil of his student days in England, Oppenheimer was always trying to rise above it. His guiding principles were discipline and work. Quoting him from a letter to his brother:

[T]he fact that discipline is good for the soul is more fundamental than any of the grounds given for its goodness. I believe that through discipline, though not through discipline alone, we can achieve serenity, and a certain small but precious measure of freedom from the accidents of incarnation…I believe that through discipline we learn to preserve what is essential to our happiness in more and more adverse circumstances, and to abandon with simplicity what would else have seemed to us indispensable.

There is another fantastic quote attributed to P.A.M. Dirac, the legendary physicist who was known to be eccentrically single minded in his dedication to science. Once Oppenheimer offered him several books as a gift. Dirac politely refused and remarked:

Reading books interfered with thought.

The Beast In the Jungle: I mentioned earlier that there are some very strong parts in the book that represent watersheds in his life. Some of them are just stunning, like the part named “The Beast In the Jungle”. Until this section of the book Oppenheimer had been doing fine, even though the number of his enemies was increasing. It was after this chapter that his decline began, or rather accelerated.

The authors mention that Oppenheimer had for a while harbored a vague sort of premonition that something dark was in store for him in the future. In the late 40s he read a novella by Henry James, written in 1903, titled “The Beast in the Jungle”. James’ story is one of obsession, tormented egotism and, as the authors put it, existential foreboding. The story is very short, and I have placed a link below for those interested in reading it. I finished it in under a couple of hours. The basic plot of the story is as follows (from Wikipedia):

John Marcher, the protagonist, is reacquainted with May Bartram, a woman he knew ten years earlier, who remembers his odd secret: Marcher is seized with the belief that his life is to be defined by some catastrophic or spectacular event, lying in wait for him like a “beast in the jungle.” May decides to buy a house in London with the money she got from her Great-Aunt who passed away, and to spend her days with Marcher curiously awaiting what fate has in store for him. Marcher is a hopeless egoist, who believes that he is precluded from marrying so that he does not subject his wife to his “spectacular fate”.

He takes May to the theatre and invites her to an occasional dinner, but does not allow her to get close to him. As he sits idly by and allows the best years of his life to pass, he takes May down as well, until the denouement where he learns that the great misfortune of his life was to throw it away, and to ignore the love of a good woman, based upon his preposterous sense of foreboding.

Oppenheimer was struck by the charge of the story and asked his friend Herb Marks to read it. After the bombings of Hiroshima and Nagasaki, as the authors write citing evidence, Oppenheimer lived with a similar feeling that someday he would be struck by his “beast in the jungle”, which would alter his whole life. Oppenheimer knew that he was being tracked and that people in the government and the intelligence services were looking for evidence to destroy him. As it turned out, his beast in the jungle was Lewis Strauss, who destroyed him.

Einstein and Oppenheimer: The parts on Einstein in the book were interesting.

[Einstein and Oppenheimer at IAS: Image source]

The book mentions Einstein in numerous instances throughout. However the ones related to Oppenheimer are the ones that I intend to touch upon.

Einstein could not understand why Oppenheimer was so keen on maintaining access to Washington and the echelons of the government. Einstein by instinct disliked politicians, and figures of authority. I quoted him in one previous post on this:

To punish me for my contempt of authority, fate made me an authority myself.

That Einstein was always uncomfortable with attention is well known to all those who admire him. There is an unforgettable quote by him in the book. On Einstein’s 71st birthday, Oppenheimer was walking him to his residence and Einstein said:

You know, when it’s once been given to a man to do something sensible, afterward life is a little strange.

Einstein suggested to Oppenheimer that he resign, given the sheer outrageousness of the attacks on him. Einstein had left Germany when the Nazi nationalist frenzy swept the country and never set foot in Germany again. He found the rise of McCarthyism in America alarming, and he thought that Oppenheimer would end up humiliating himself.

As the authors note, Einstein’s instincts were right. He confided to a friend:

Oppenheimer is not a gypsy like me, I was born with the skin of an Elephant; there is no one who can hurt me.

and he thought that Oppenheimer was the reverse.

I would once again recommend the book to everyone who can get their hands on it. I would also congratulate the authors on an extremely gripping and compelling biography. It is amazing to think that they could piece together research spread over 25 years in such a wonderful way, when I am not even able to get myself to complete a 60 page report on a project based on Support Vector Machines, in which I don’t even have to “piece together” things.

* This line is in no way borrowed from, or inspired by, the Newsweek review of this book.


1. American Prometheus: The Triumph and Tragedy of J. Robert Oppenheimer – Kai Bird and Martin Sherwin

2. The Beast in the Jungle – Henry James (Download it from Project Gutenberg)
