
Archive for the ‘Biology’ Category

Darwinian Evolution is a form of Computational Learning

The punchline of this book is perhaps: “Changing or increasing functionality of circuits in biological evolution is a form of computational learning.” Although the book also speaks of topics other than evolution, the underlying framework is the Probably Approximately Correct (PAC) model [1] from the theory of Machine Learning, from which the book gets its name.

"Probably Approximately Correct" by Leslie Valiant

“Probably Approximately Correct” by Leslie Valiant

[Clicking on the image above will direct you to the Amazon page for the book]

I first heard of this explicit connection between Machine Learning and Evolution in 2010 and have been quite fascinated by it since. It must be noted, however, that similar ideas have appeared in the past, though usually in the form of metaphor. It is another matter that this metaphor has generally been avoided, for reasons I underline towards the end of this review. When I first heard the idea it immediately made sense, and like many great ideas it looks completely obvious in hindsight. Ergo, I was quite excited to see this book and preordered it months in advance.

This book is basically a popular science version, and an expansion, of the ideas that appeared in a Journal of the ACM article titled “Evolvability” in 2007 [2]. I have to say I was expecting a bit more from the book than it delivers, and in this sense I was somewhat disappointed. At the same time, it can be a useful ‘starting’ book for those interested in the broad idea but without much background. We all have examples of books like this; here’s one from me: a couple of years ago I read a book on knots by Alexei Sossinsky. It got bad reviews from many serious mathematicians for not really being about mathematics, besides having many mistakes. But for a complete beginner, I thought it was an absolutely wonderful book, introducing the basic ideas very informally and sharing the author’s infectious enthusiasm for problems in Knot Theory. Coming back: in short, if you have enough background in the basic notions of PAC Learning, some chapters of the book might feel quite redundant, but depending on how you read, it might still turn out to be a good read anyway. Judging from some other (rude) reviews: if you are a professional nitpicker then this book is probably not worth your time, since it deals with ideas and not with their formal treatment. Given what it is aiming at, I think it is a good book.

Before attempting to sketch a skiagram of the main content of the book: one of the main subthemes, constantly emphasized, is to look at computer science as an enabling tool for studying natural science. This is often ignored, perhaps because CS curricula are rarely designed with any natural science component in them, and hence there is no impetus for aspiring computer scientists to view the natural sciences through the computational lens. On the other hand, the relation of computer science with mathematics has become quite well established, and as a technology the impact of computer science has been tremendous. All this is quite remarkable given that just about a century ago the notion of a computation was not even well defined. (Unrelated to the book: more recently people have started taking seriously the idea of digital physics, i.e. physics from a solely computable/digital perspective. But in the other natural sciences the usage of computational ideas is still woeful.) Valiant draws upon the example of Alan Turing as a natural scientist, and not just a computer scientist, to make this point. Turing was deeply interested in natural phenomena (intelligence, the limits of mechanical calculation, pattern formation etc.) and used tools from computer science to study them, a fact evident from his publications. That Turing was trying to understand natural phenomena was remarked upon in his obituary by Max Newman, who summarized the body of his work as concerning “the extent and the limitations of mechanistic explanations of nature”.

________________

The book begins with a delightful and quite famous quote by John von Neumann (through this paragraph I also discovered the context of the quote). The paragraph also summarizes the motivation for the book very well:

“In 1947 John von Neumann, the famously gifted mathematician, was keynote speaker at the first annual meeting of the Association for Computing Machinery. In his address he said that future computers would get along with just a dozen instruction types, a number known to be adequate for expressing all of mathematics. He went on to say that one need not be surprised at this small number, since 1000 words were known to be adequate for most situations in real life, and mathematics was only a small part of life, and a very simple part at that. The audience responded with hilarity. This provoked von Neumann to respond: “If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is”

Though counterintuitive, von Neumann’s quip contains an obvious truth. Einstein’s theory of general relativity is simple in the sense that one can write the essential content on one line as a single equation. Understanding its meaning, derivation, and consequences requires more extensive study and effort. However, this formal simplicity is striking and powerful. The power comes from the implied generality, that knowledge of one equation alone will allow one to make accurate predictions about a host of situations not even connected when the equation was first written down.

Most aspects of life are not so simple. If you want to succeed in a job interview, or in making an investment, or in choosing a life partner, you can be quite sure that there is no equation that will guarantee you success. In these endeavors it will not be possible to limit the pieces of knowledge that might be relevant to any one definable source. And even if you had all the relevant knowledge, there may be no surefire way of combining it to yield the best decision.

This book is predicated on taking this distinction seriously […]”

In a way, the aspects of life mentioned above appear theoryless, in the sense that there seems to be no mathematical or scientific theory for them like relativity: something quantitative, definitive and short. Note that they are generally not “theoryless” in the sense of there being no theory at all, since people obviously perform all the tasks mentioned quite effectively in a complex world. A specific example is how organisms adapt and species evolve without having a theory of the environment as such. How such coping mechanisms can come into being in the first place is the main question asked in the book.

Let’s stick to the specific example of biological evolution. Clearly, it is one of the central ideas in biology and perhaps one of the most important theories in science, one that changed the way we look at the world. But in spite of its importance, Valiant claims (correctly, in my opinion) that evolution is not understood well in a quantitative sense. Evidence that convinces us of its correctness is of the following sort: biologists usually show a sequence of evolving objects, stages, where the one coming later is more complex than the previous. Since this is studied mostly via the fossil record, there is always a lookout for missing links between successive stages. (As a side note: Animal Eyes is an extremely fascinating and very readable book that delves into this question specifically for the evolution of the eye. This is particularly interesting given that, by the very nature of the eye, there can be no fossil record of it.) Darwin had remarked that numerous successive stages are necessary; that is, if it were not possible to trace a path from an earlier form to a more complicated form, it would be hard to explain how the latter came about. But, as Valiant stresses, this is not really an explanation of evolution. It is more of an “existence proof” than a quantitative explanation: even if there is evidence for the existence of a path, one can’t really say that a certain path is likely to be followed just because it exists. As another side note: watch this video on evolution by Carl Sagan from the beautiful COSMOS.

Related to this, one of the first questions one might ask, and indeed one asked by Darwin himself: why has evolution taken as much time as it has? How much time would have sufficed for all the complexity we see around us to evolve? This question in fact bothered Darwin a lot in his day. In On the Origin of Species he originally gave an estimate of the Earth’s age of at least 300 million years, implying, indirectly, that there was enough time. This estimate was immediately attacked by Lord Kelvin, one of the pre-eminent British physicists of the time, who estimated the age of the Earth to be only about 24 million years, causing Darwin to withdraw the estimate from his book. Kelvin’s figure greatly worried Darwin, as he thought 24 million years just wasn’t enough time. To motivate the same theme, Valiant writes the following:

“[…] William Paley, in a highly influential book, Natural Theology (1802) , argued that life, as complex as it is, could not have come into being without the help of a designer. Numerous lines of evidence have become available in the two centuries since, through genetics and the fossil record, that persuade professional biologists that existing life forms on Earth are indeed related and have indeed evolved. This evidence contradicts Paley’s conclusion, but it does not directly address his argument. A convincing direct counterargument to Paley’s would need a specific evolution mechanism to be demonstrated capable of giving rise to the quantity and quality of the complexity now found in biology, within the time and resources believed to have been available. […]”

A specific, albeit more limited, version of this question might be the following. Consider the human genome, which has about 3 billion base pairs. If evolution is a search problem, as it naturally appears to be, then why did the evolution of this genome not take exponential time? Had it taken exponential time, such evolution could not have happened on any time scale since the origin of the Earth. Thus, a more pointed question to ask is: what circuits could evolve in sub-exponential time (and, relatedly, what circuits are evolvable only in exponential time)? Given the context, the idea of thinking about this in terms of circuits might seem a little foreign. But on further thought it is quite clear that the idea of a circuit is natural when it comes to modeling such systems (at least in principle). For example, one might think of the genome as a circuit, just as one might think of the complex protein interaction networks and the networks of neurons in the brain as circuits that update themselves in response to the environment.
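As a quick back-of-the-envelope illustration of why “exponential” is hopeless here (my numbers, not the book’s): a blind exhaustive search over genome sequences of length $n$ on the 4-letter alphabet would face a search space of size

$4^{n} = 10^{n \log_{10} 4} \approx 10^{1.8 \times 10^{9}}$ for $n = 3 \times 10^{9}$,

while the Earth is only about $4.5 \times 10^{9}$ years, i.e. roughly $10^{17}$ seconds, old. No count of organisms, generations or parallel trials comes remotely close to bridging that gap, so any credible mechanism must get by with polynomially bounded resources.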

The last line is essentially the key idea of adaptation: entities interact with the environment and update themselves (hopefully to cope better) in response. But the catch is that the world/environment is far too complex for (relatively) simple entities, with limited computational power, to have a theory of. Hence the entity will somehow have to cope without really “understanding” the environment (it can only be partially modeled) and still improve its functionality. The key thing to pay attention to here is the interaction, or the interface, between the limited computational entity in question and the environment. The central idea in Valiant’s thesis is to think of and model this interaction as a form of computational learning: the entity will absorb information about the world and “learn” so that in the future it “can do better”. A lot of biology can be thought of in these terms: complex behaviours in environments, where by complex behaviours we mean that in different circumstances our limited computational entity might behave completely differently; complicated in the sense that there can’t possibly be a lookup table for modeling such interactions. Such interaction-responses can be thought of as complicated functions, e.g. how a deer would react can be seen as a function of some sensory inputs. For another example, consider protein circuits: the human body has about 25,000 proteins, and how much of a certain protein is expressed is a function of the quantities of other proteins in the body. This function is quite complicated, certainly not something like a simple linear regression. Thus there are two main questions to be asked. One, how did these circuits come into being without a designer to actually design them? Two, how do such circuits maintain themselves? Each “node” in a protein circuit is a function, and as circumstances change it might be best to change the way these nodes work. How could such a thing have evolved?

Given the above, one might ask another question: at what rate can functionality increase or adapt under the Darwinian process? Valiant (like others, such as Chaitin; see this older blog post for a short discussion) comes to the conclusion that Darwinian evolution is a very elegant computational process. And if it is a computational process, then there ought to be a quantitative way of saying what rate of environmental change can be kept up with, and which environments are unevolvable for the entity. It is not hard to see that this is essentially a question in computer science; no other discipline has the tools to deal with it.

In so far as (biological) interaction-responses can be thought of as complicated functions, and the limited computational entity trying to cope has to do better in the future, this is just machine learning! This idea, that changing or increasing functionality of circuits in biological evolution is a form of computational learning, is perhaps very obvious in hindsight. Changing functionality is done in Machine Learning in the following sense: we want to acquire complicated functions, without explicitly programming for them, from examples and labels (or “correct” answers). This addresses exactly the question of how complex mechanisms can evolve without someone designing them (consider a simple perceptron learning algorithm as a toy example). In short: we generate a hypothesis, and if it doesn’t match our expectations (in performance) then we update the hypothesis by a computational procedure; even a single example might be enough to change the hypothesis. One could draw an analogy to evolution where “examples” are experiences and the genome is the circuit that is modified over a period of time. But note that this is not what happens in evolution, because the above (especially the perceptron example) sounds more Lamarckian. In the Darwinian process we don’t change the genome directly based on experiences. What happens instead is that many copies of the genome are made and tested in the environment, with the better ones having higher fitness. Supervised machine learning as drawn above is very Lamarckian, not Darwinian.
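To make the supervised (and, as argued above, Lamarckian) picture concrete, here is a minimal perceptron sketch in Python. This is my own toy illustration, not something from the book; the data and the hidden rule are made up:

```python
import random

def perceptron(examples, dim, epochs=20):
    """Mistake-driven learning: each misclassified example directly
    edits the hypothesis (w, b) -- experience rewrites the 'genome'."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in examples:                 # labels y are +1 or -1
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:                     # a single example suffices
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(50)]
data = [(x, 1 if x[0] > x[1] else -1) for x in pts]   # hidden rule: x0 > x1
w, b = perceptron(data, dim=2)
```

Notice that the hypothesis is rewritten in place by individual experiences; this is exactly the feature the Darwinian process lacks.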

Thus, there is something unsatisfying in the analogy to supervised learning, which has a clear notion of a target. One might then ask: what is the target of evolution? Evolution is usually thought of as an undirected process, without a goal. This is true in one important sense and incorrect in another. Evolution does not have a goal in the sense of wanting to evolve humans or elephants. But it certainly does have a target: “good behaviour” (where behaviour is used very loosely) that aids the survival of the entity in an environment. This target is the “ideal” function (which might be quite complicated) that tells us how to behave in which situation. This is already incorporated in the study of evolution through notions such as fitness, which encode that various functions are more beneficial than others. Thus evolution can be framed as a kind of machine learning problem based on the notion of Darwinian feedback: make many copies of the current genome and let the ones with good performance in the real world win. More specifically, this is a limited kind of PAC Learning. If you call your current hypothesis your genome, then your genome does not depend directly on your experiences; variants of the genome are generated by a polynomial-time randomized Turing machine. To illuminate the difference from supervised learning, we come back to the point made earlier that supervised PAC Learning is essentially Lamarckian: we have a hidden function that we want to learn, we receive examples and labels corresponding to this hidden function (these can be considered “queries” to it), and we process these example/label pairs to learn a “good estimate” of the hidden function in polynomial time. Evolvability is slightly different. Again, there is a hidden “ideal” function f(x), and the hypotheses are genomes. But how we can find out more about the target is very limited, since one can empirically observe only the aggregate goodness of a genome (a performance function). The task is then to mutate the genome so that its functionality improves. So the main difference from the usual supervised learning is that the hidden function can be queried only in a very limited way: one cannot act on a single example and has to take aggregate statistics into account.

One might then ask what can be proved in this model, and Valiant demonstrates some examples: parity functions are not evolvable over the uniform distribution, while monotone disjunctions actually are. These functions are of course very non-biological, but they illustrate the main idea: the ideal function, together with the real-world distribution, imposes a fitness landscape, and in some settings we can show that some functions can evolve and some cannot. This in turn illustrates that evolution is a computational process independent of substrate.
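To make the contrast with the perceptron concrete, here is a rough simulation sketch of the Darwinian feedback loop described above, applied to a monotone disjunction under the uniform distribution. This is my own toy rendering of the model as summarized in this review, not Valiant’s formal definitions; the hidden “ideal” function and all parameters are made up:

```python
import random

N = 10                                    # number of Boolean variables
IDEAL = {0, 3, 7}                         # hidden ideal function: x0 OR x3 OR x7

def disj(subset, x):
    """Evaluate a monotone disjunction (OR over a subset of variables)."""
    return any(x[i] for i in subset)

def performance(subset, samples=2000):
    """Aggregate agreement with the ideal function under the uniform
    distribution -- the only feedback the 'organism' ever receives."""
    hits = 0
    for _ in range(samples):
        x = [random.randint(0, 1) for _ in range(N)]
        hits += disj(subset, x) == disj(IDEAL, x)
    return hits / samples

random.seed(1)
genome = set()                            # start from the empty disjunction
for generation in range(200):
    i = random.randrange(N)
    variant = genome ^ {i}                # mutate: toggle one variable in/out
    # Darwinian feedback: keep whichever copy performs better in aggregate
    if performance(variant) >= performance(genome):
        genome = variant

print(sorted(genome), performance(genome))   # tends toward [0, 3, 7]
```

The point to notice is that no individual labeled example is ever consulted; only aggregate performance drives the updates, which is exactly the restriction that separates evolvability from general supervised learning.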

In the same vein it can be shown that Boolean conjunctions are not evolvable over all distributions. There has also been similar recent work on real-valued functions, which is not reported in detail in the book. Another recent line of work, mentioned only in passing towards the end, is the study of multidimensional spaces of actions, dealing with sets of functions (not just one function) that might be evolvable together. This is an interesting direction, as it is pretty clear that biological evolution deals with the evolution of sets of functions together, not functions in isolation.

________________

Overall I think the book is quite good, although I would rate it 3/5. Let me explain. Clearly this book is aimed at the non-expert. But this might disappoint those who bought the book because this recent area of work, studying evolution through the lens of computational learning, is very exciting and intellectually interesting. The book is also aimed at biologists, and considering this, the learning sections are quite dumbed down. Yet I think the book might fail to impress most of them anyway, because biologists (barring a small subset) generally have a very different way of thinking from mathematicians or computer scientists, especially through the computational lens. Over the past couple of years I have had arguments about the main ideas in the book with biologist friends who take the usage of “learning” to imply that evolution is a directed process. It would have been great if the book had spent more time on this particular aspect. Also, the book explicitly states that it is about quantitative aspects of evolution and has nothing to do with speciation, population dynamics and other rich areas of study; nevertheless, I have already seen the book criticised by biologists on exactly this premise.

As far as I am concerned, as an admirer of Prof. Leslie Valiant’s range and quality of contributions, I would have preferred the book to go into more depth: a semi-formal monograph on the study of evolution using the tools of PAC Learning, right from the person who initiated this area of study. But that is just a selfish consideration.

________________

References:

[1] A Theory of the Learnable, Leslie Valiant, Communications of the ACM, 1984 (PDF).

[2] Evolvability, Leslie Valiant, Journal of the ACM, 2007 (PDF).

________________


A second post in a series of posts about Information Theory/Learning based perspectives in Evolution, which started off with the last post.

Although the last post was mostly historical, it had a section reviewing the main motivation for some work in metabiology due to Chaitin (now published as a book). The starting point of that work was to view evolution solely through an information-processing lens (and hence the use of Algorithmic Information Theory). Of course this lens is not by itself a recent acquisition and goes back a few decades (although in hindsight the fact that it goes back just a few decades is very surprising, to me at least). To illustrate this I wanted to share some analogies by John Maynard Smith (perhaps one of my favourite scientists), which I found particularly incisive and clear. To avoid clutter, they are shared here instead (note that most of what he talks about is something we study in high school; the talk is still quite good, especially because it emphasizes the centrality of information throughout). I also want this post to act as a reference for some upcoming posts.

Coda:

Molecular Biology is all about Information. I want to be a little more general than that; the last century, the 19th century was a century in which Science discovered how energy could be transformed from one form to another […] This century will be seen […] where it became clear that information could be translated from one form to another.

[Other parts: Part 2, Part 3, Part 4, Part 5, Part 6]

Throughout this talk he gives wonderful analogies for how information translation underlies the so-called Central Dogma of Molecular Biology, and how the translation being one-way at some stages has implications (e.g. how August Weismann noted that acquired characters are not inherited, giving a “Chinese telegram translation” analogy: there was no mechanism to translate acquired traits (acquired information) back into the organism’s hereditary material so that they could be propagated).

However, the most important point from the talk: one could see evolution as being punctuated by about six major changes or shifts, each marked by a change in the way information was stored and processed. Some that he talks about are:

1. The origin of replicating molecules.

2. The evolution of chromosomes: Chromosomes are just strings of the above replicating molecules, with the property that when one of these molecules is replicated, the others have to be replicated as well. The utility of this is the following: if the genes were separate, they might have different rates of replication, and the gene that replicates fastest would soon outnumber all the others, so that all the information they carried would be lost. This transition thus underlies a kind of evolution of cooperation between replicating molecules; in other words, chromosomes are a way of forcing cooperation between genes (a small quantitative gloss on this follows below).
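The dynamics behind this forced cooperation are just compound growth (my gloss, not from the talk): if two free-floating genes replicate at per-generation rates $r_1 > r_2$, then after $t$ generations the ratio of their copy numbers scales as $(r_1/r_2)^t$, which diverges exponentially; the slower gene, and the information it carries, is effectively lost. Linking the genes on one chromosome forces a single shared replication rate, so no gene can outgrow the others.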

3. The evolution of the code: That information in the nucleic acids could be translated into sequences of amino acids, i.e. proteins.

4. The origin of sex: The evolution of sex is considered an open question. However, one argument goes that sexual reproduction hastens the acquisition of information from the environment (as compared to asexual reproduction), which explains why it should evolve (details in the next post or the one after).

5. The evolution of multicellular organisms: A large, complex signalling system had to evolve for the different kinds of cells (like muscle cells or neurons, to name some in humans) to function properly in one organism.

6. The transition from solitary individuals to societies: What made these societies of individuals (ants, humans) possible at all? If we stick to humans, this could have happened only if there was a new way to transmit information from generation to generation, and one such information-transducing machine is language! Language gives an additional mechanism, besides the genetic one, to transmit information from one generation to another (he compares the genetic code and replication of nucleic acids with the passage of information by language). This momentous event, the evolution of language, was itself dependent on genetics. With the evolution of language other things came by: writing, memes etc., which might reproduce and self-replicate, mutate, be passed on, and accelerate the process of evolution. He ends by saying this stage of evolution could perhaps be as profound as the evolution of language itself.

________________

As a side comment: I highly recommend the following interview of John Maynard Smith as well. I rate it higher than the above lecture, although it is sort of unrelated to the topic.

________________

Interesting books to perhaps explore:

1. The Major Transitions in Evolution: John Maynard Smith and Eörs Szathmáry.

2. The Evolution of Sex: John Maynard Smith (more on this theme in later blog posts, mostly related to learning and information theory).

________________


Long Blurb: In the past year, I read three books by Gregory Chaitin: his magnum opus MetaMath! (which had been on my reading stack since 2007), a little new book, Proving Darwin, and most of Algorithmic Information Theory (available as a PDF as well). Before this I had only read articles and essays by him. I have been sitting on multiple blog post drafts based on these readings, but somehow I never got around to finishing them; perhaps it is time to publish them as is! Chaitin is an engaging and provocative writer (and speaker! see this fantastic lecture for instance) who really makes me think. It has come to my notice that a lot of people find his writing style uncomfortable (for apparently lacking humility and having copious hyperbole). While there might be some merit to this view, if one is able to look beyond these minor misgivings, he always has a lot of interesting ideas. The apparent hyperbole is best read as effusive enthusiasm, and reading it that way lets one get the most out of his writings (since there is indeed a lot to be mined). I might talk about this when I actually write about his books and the ideas therein. For now, this post was in part inspired by something I read in the second book, Proving Darwin.

I thought this was a thought-provoking book, but perhaps not yet suited to be a book. It builds on the idea of DNA as software and tries to think of evolution as a random walk in software space. The idea of DNA as software is not new. If one considers DNA to be software, and the other biological processes to be digital as well, then one is essentially sweeping out all that might not be digital. This view of biological processes might thus be, at best, an imprecise metaphor. But Chaitin quotes Picasso, “art is a lie that helps us see the truth”, and explains that this is just a model; the point is to see whether anything mathematically interesting can be extracted from it.

He eloquently points out that once DNA is thought of as a giant codebase, we begin to see that human DNA is just a major software patchwork, with ancient subroutines in common with fish, sponges, amphibians etc. As is the case with large software projects, the old code can never be thrown away and has to be reused, patched and made to work, growing by the principle of least effort (so when someone falls ill and complains of a certain body part being badly designed, it is just a case of bad software engineering!). The “growth by accretion” of this codebase perhaps explains Ernst Haeckel’s slogan “ontogeny recapitulates phylogeny”, which roughly means that the growth of a human embryo resembles the successive stages of evolution (thus biology is just a kind of software archeology).

All this sounds like a good analogy. But is there any way to formalize this vague-ish notion of evolution as a random walk in software space and prove theorems about it? In other words, does a theory as elegant as Darwinian evolution have a purely mathematical core? It must, says Chaitin, and since the core of biology is basically information, he reckons that tools from Algorithmic Information Theory might give some answers. Since natural software is too messy, Chaitin considers artificial software in parallel to natural software, in a contrived toy setting, and claims that in this setting one can see a mathematical proof that such software evolves. I have read some criticisms of this by biologists on the blogosphere; most, however, criticise it on premises that Chaitin explicitly states the book is not about. There are some good critiques; I will write about these when I post my draft on the book.

________________

Now coming to the title of the post. The following are some excerpts from the book:

“But we could not realize that the natural world is full of software, we could not see this, until we invented human computer programming languages […] Nobel Laureate Sydney Brenner shared an office with Francis Crick of Watson and Crick. Most Molecular Biologists of this generation credit Schrödinger’s book What is Life? (Cambridge University Press, 1944) for inspiring them. In his autobiography My Life in Science Brenner instead credits von Neumann’s work on self-reproducing automata.

[…] we present a revisionist history of the discovery of software and of the early days of molecular biology from the vantage point of Metabiology […] As Jorge Luis Borges points out, one creates one’s predecessors! […] in fact the past is not only occasionally rewritten, it has to be rewritten in order to remain comprehensible to the present. […]

And now for von Neumann’s self reproducing automata. von Neumann, 1951, takes from Gödel the idea of having a description of the organism within the organism = instructions for constructing the organism = hereditary information = digital software = DNA. First you follow the instructions in the DNA to build a new copy of the organism, then you copy the DNA and insert it in the new organism, then you start the new organism running. No infinite regress, no homunculus in the sperm! […]

You see, after Watson and Crick discovered the molecular structure of DNA, the 4-base alphabet A, C, G, T, it still was not clear what was written in this 4-symbol alphabet, it wasn’t clear how DNA worked. But Crick, following Brenner and von Neumann, somewhere in the back of his mind had the idea of DNA as instructions, as software”

I did find this quite interesting, even if it sounds revisionist, because the power of a paradigm-shifting conceptual leap is often understated after enough time has passed. The more time passes, the more “obvious” the leap becomes and hence the more “diffused” its impact. Consider the idea of a “computation”: a century ago there was no trace of any such idea, but once it was born and diffused we basically took it for granted. In fact, in my opinion, thinking of a lot of things (anything?) precisely is nearly impossible without the notion of a computation. von Neumann often remarked that Turing’s 1936 paper contained both the idea of software and of hardware, the actual computer being just an implementation of that idea. In the same spirit, von Neumann’s ideas on self-reproducing automata have had a similar impact on the way people started thinking about natural software, replication and even artificial life (recall the famous von Neumann quote: life is a process that can be abstracted away from any particular medium).

While I found this proposition quite interesting, I left it at that. Chaitin cited Brenner’s autobiography, which I did not obtain since I hadn’t planned on reading it, and Google Books previews did not suffice either. So I didn’t pursue what Brenner actually said.

However, this changed recently, when I exchanged some emails with Maarten Fornerod, who pointed me to an interview of Sydney Brenner that actually talks about this.

The relevant four parts of this very interesting 1984 interview can be found here, here, here and here. The text is reproduced below.

“45: John von Neumann and the history of DNA and self-replication:

What influenced me the most was the articles of von Neumann. Now, I had become interested in von Neumann not through the coding issues but through his interest in the nervous system and computers and of course, that’s what Seymour was interested in because what we wanted to do is find out how the brain worked; that was what… you know, like a hobby on the side, you know. After… after dinner we’d work on central problems of the brain, you see, but it was trying to find out how this worked. And so I got this symposium, the Hixon Symposium, which I had got to read. I was, at that time, very fascinated by a rather strange complexion of psychology things by a man called Wolfgang Köhler who was a gestalt psychologist and another man called Lewin, who was what was called a topological psychologist and of course they were talking at a very different level, but in this book there is an article by Köhler and of course in that book there’s this very famous paper which no one has ever read of von Neumann. Now of course later I discovered that those ideas were much older than the dating of this and that people had taken notes of them and they had circulated. But what is the brilliant part of this paper is in fact his description of what it takes to make a self-reproducing machine. And in fact if you look at what he says and what Schrödinger says, you can see what I have come to call Schrödinger’s fundamental error, and in fact we can… I can find you the passage in that. But it’s an amazing passage, because what von Neumann shows is that you have to have a mechanism not only of copying the machine but of copying the information that specifies the machine, right, so that he then divided the machine – the… the automaton as he called it – into three components: the functional part of the automaton; a… a decoding section of this which is part of that, which actually takes the tape, reads the instructions and builds the automaton; and a device that takes a copy of this tape and inserts it into the new automaton, right, which is the essential… essential, fundamental… and when von Neumann said that is the logical basis of self-reproduction, then you can see where Schrödinger made his mistake and this can be summarised in one sentence. Schrödinger says the chromosomes contain the information to specify the future organism and the means to execute it and that’s not true. The chromosomes contain the information to specify the future organisation and a description of the means to implement, but not the means themselves, and that logical difference is made so crystal clear by von Neumann and that to me, was in fact… The first time now of course, I wasn’t smart enough to really see that this is what DNA is all about, and of course it is one of the ironies of this entire field that were you to write a history of ideas in the whole of DNA, simply from the documented information as it exists in the literature, that is a kind of Hegelian history of ideas, you would certainly say that Watson and Crick depended on von Neumann, because von Neumann essentially tells you how it’s done and then you just… DNA is just one of the implementations of this. But of course, none knew anything about the other, and so it’s a… it’s a great paradox to me that in fact this connection was not seen. Linus Pauling is present at that meeting because he gives this False Theory of Antibodies there. 
That means he heard von Neumann, must have known von Neumann, but he couldn’t put that together with DNA and of course… well, neither could Linus Pauling put his own paper together with his future work because he and Delbrück wrote a paper on self… on self-complementation – two pieces of information – in about 1949 and had forgotten it by the time he did DNA, so that of course leads to a really distrust about what all the historians of science say, especially those of the history of ideas. But I think that that in a way is part of our kind of revolution in thinking, namely the whole of the theory of computation, which I think biologists have yet to assimilate and yet is there and it’s a… it’s an amazingly paradoxical field. You know, most fields start by struggling through from, from experimental confusion through early theoretical, you know, self-delusion, finally to the great generality and this field starts the other way round. It starts with a total abstract generality, namely it starts with, with Gödel’s hypothesis or the Turing machine, and then it takes, you know, 50 years to descend into… into banality, you see. So it’s the field that goes the other way and that is again remarkable, you know, and they cross each other at about 1953, you know: von Neumann on the way down, Watson and Crick on the way up. It was never put together.

[Q] But… but you didn’t put it together either?

I didn’t put it together, but I did put together a little bit later that, because the moment I saw the DNA molecule, then I knew it. And you connected the two at once? I knew this.

46: Schrodinger Wrong, von Neumann right:

I think he made a fundamental error, and the fundamental error can be seen in his idea of what the chromosome contained. He says… in describing what he calls the code script, he says, ‘The chromosome structures are at the same time instrumental in bringing about the development they foreshadow. They are law code and executive power, or to use another simile, they are the architect’s plan and the builder’s craft in one.’ And in our modern parlance, we would say, ‘They not only contain the program but the means to execute the program’. And that is wrong, because they don’t contain the means; they only contain a description of the means to execute it. Now the person that got it right and got it right before DNA is von Neumann in developing the logic of self-reproducing automata which was based of course on Turing’s previous idea of automaton and he gives this description of this automaton which has one part that is the machine; this machine is built under the instructions of a code script, that is a program and of course there’s another part to the machine that actually has to copy the program and insert a copy in the new machine. So he very clearly distinguishes between the things that read the program and the program itself. In other words, the program has to build the machinery to execute the program and in fact he says it’s… when he tries to talk about the biological significance of this abstract theory, he says: ‘This automaton E has some further attractive sides, which I shall not go into at this time at any length’.

47: Schrodinger’s belief in calculating an organism from chromosomes:

One of the central things in the Schrödinger’s little book What Is Life is what he has to say about the chromosomes containing the program or the code, and he says here… on page 20, he says: ‘In calling the structure of the chromosome fibres a code script, we mean that the all penetrating mind once conceived by Laplace, to which every cause or connection lay immediately open, could tell from their structure whether the egg would develop under suitable conditions, into a black cock, or into a speckled hen, into a fly, or a maize plant, a rhododendron, a beetle, a mouse or a woman’. Now, apart from the list of organisms he has given, which falls short of a serious classification of the living world, what he is saying here is that if you could look at the chromosomes, you could compute the… you could calculate the organism, but he’s saying something more. He’s saying that you could actually implement the organism because he goes on to add: ‘But the term code script is of course too narrow. The chromosome structures are at the same time instrumental in bringing about the development they foreshadow. They are law code and executive power, or to use another simile they are architect’s plan and builder’s craft in one.’ What he is saying here is that the chromosomes not only contain a description of the future organism but the means to implement that description or program as we might call it. That’s wrong, because they don’t contain the means to implement it, they only contain a description of the means to implement it and that distinction was made absolutely crystal clear by this remarkable paper of von Neumann, in which he develops and in fact provides a proof of what a self-reproducing automaton – or machine – would have to… would have to contain in order to satisfy the requirements of self-reproduction. He develops this concept of course from the earlier concept of Turing, who had developed ideas of automata that operated on tapes and which of course gave a mechanical example of how you might implement some computation. And if you like, this is the beginning of the theory of computation and what von Neumann has to say about this is made very clear in this essay. And he says that you’ve got to have several components, and I’ll just summarise them. He said you first start with an automaton A, ‘which, when furnished this description of any other automaton in terms of the appropriate functions, will construct that entity’. It’s like a Turing machine, you see. So automaton A has to be given a tape and then it’ll make another automaton A, all right. Now what you have, you’ve got to have… add to this automaton B. ‘Automaton B can make a copy of any instruction I that is furnished to it’, so if you give it a program, it’ll copy the program. And then you have… you combine A and B with each other and you give a control mechanism C, that you have to add as well, which does the following. It says, ‘Let A be furnished with the instruction I’. So you give… you give the tape to A, A copies it under the control of C, okay, by every description, so it makes another copy of A, then you… and of course B is part of A. Then you… C then tells B, ‘copy this tape I’, gives B the copying part of it, this copies it and then C will then take the tape I, and put it into the new machine, okay, and turns it loose as he says here, ‘as an independent entity’.

48: Automata akin to Living Cells:

What von Neumann says is that you need several components in order to provide this self-reproducing automaton. One component, which he calls automaton A, will make another automaton A when furnished with a description of itself. Then you need an automaton C… you need an automaton B, which has the property of copying the instruction tape I, and then you need a control mechanism which will actually control the switching. And so this automaton, or machine, can reproduce itself in the following way. The entity A is provided with a copy of… with the tape I, it now makes another A. The control mechanism then takes the tape I and gives it to B and says make a copy of this tape. It makes a copy of the tape and the control mechanism then inserts the new copy of the tape into the new automaton and effects the separation of the two. Now, he shows that the entire complex is clearly self-reproductive, there is no vicious circle and he goes on to say, in a very modest way I think, the following. He says, ‘The description of this automaton has some further attractive sides into which I shall not go at this time into any length. For instance, it is quite clear that the instruction I is roughly effecting the functions of a gene. It is also clear that the copying mechanism B performs the fundamental act of reproduction, the duplication of the genetic material which is also clearly the fundamental operation in the multiplication of living cells. It is also clear to see how arbitrary alterations of the system E, and in particular of the tape I, can exhibit certain traits which appear in connection with mutation, which is lethality as a rule, but with a possibility of continuing reproduction with a modification of traits.’ So, I mean, this is… this we know from later work that these ideas were first put forward by him in the late ’40s. This is the… a published form which I read in early 1952; the book was published a year earlier and so I think that it’s a remarkable fact that he had got it right, but I think that because of the cultural difference – distinction between what most biologists were, what most physicists and mathematicians were – it absolutely had no impact at all.”
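Read literally, the A/B/C decomposition Brenner describes is almost pseudocode already. Here is a minimal toy rendering in Python (my own sketch for illustration, not von Neumann’s formal construction): the tape is a description of the machine, never the machinery itself, which is exactly the distinction Brenner says Schrödinger missed.

```python
def automaton_A(tape):
    """Constructor: build the machine the tape describes (without its tape)."""
    return {"behavior": tape["behavior"], "tape": None}

def automaton_B(tape):
    """Copier: duplicate the description, symbol for symbol."""
    return dict(tape)

def control_C(machine):
    """Controller: run A on the tape, have B copy it, insert the copy
    into the offspring and turn it loose as an independent entity."""
    tape = machine["tape"]
    offspring = automaton_A(tape)
    offspring["tape"] = automaton_B(tape)
    return offspring

parent = {"behavior": "metabolize", "tape": {"behavior": "metabolize"}}
child = control_C(parent)
grandchild = control_C(child)   # no infinite regress: the description is data
```

The description is interpreted once (by A) and copied once (by B); nothing in the tape has to contain a smaller copy of the whole machine, which is why there is no homunculus.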

________________

References and Also See:

1. Proving Darwin: Making Biology Mathematical, Gregory Chaitin (Amazon).

2. Theory of Self-Reproducing Automata, John von Neumann. (PDF).

3. 1966 film on John von Neumann.

________________

Changing or increasing functionality of circuits in biological evolution is a form of computational learning. – Leslie Valiant

The title of this post comes from Prof. Leslie Valiant’s ACM A. M. Turing Award lecture, titled “The Extent and Limitations of Mechanistic Explanations of Nature”.

Prof. Leslie G. Valiant

Click on the image above to watch the lecture

[Image Source: CACM “Beauty and Elegance”]

Short blurb: The lecture came out sometime in June-July 2011, and I have shared it (and a paper it quotes) on every online social network I have a presence on, yet I have no idea why I never blogged about it.

The fact that I have zero training in (and epsilon knowledge of) biology has not stopped me from being completely fascinated by the contents of the talk and a few papers that he cites in it. I have watched the lecture a few times and have also started to read and understand some of the papers he mentions. In fact, the talk has inspired me to learn more about PAC Learning than the usual Machine Learning graduate course might cover. Knowing more about it is now my “full time side-project”, and a very exciting side-project to say the least!

_________________________

Getting back to the title: One of the motivating questions about this work is the following:

It is widely accepted that Darwinian evolution has been the driving force for the immense complexity observed in life. In this beautiful 10 minute video, Carl Sagan sums up the timeline and the progression:

There is however one problem: while evolution is considered the driving force behind such complexity, there isn’t a satisfactory explanation of how 13.75 billion years could have been enough time. Many have complained that this reduces evolution to little more than an intuitive explanation. Can we understand the underlying mechanism of evolution (one that can in turn give reasonable time bounds)? Valiant makes the case that this underlying mechanism is computational learning.

There have been a number of computational models based on the general intuitive idea of Darwinian evolution, including Genetic Algorithms/Programming. However, people like Valiant find such methods useful in an engineering sense but unsatisfying with respect to this question.

In the talk Valiant mentions that this question was asked in Darwin’s day as well, and that Darwin proposed a bound of 300 million years for such evolution to have occurred. This immediately ran into a problem, as Lord Kelvin, one of the leading physicists of the time, put the age of the Earth at about 24 million years; obviously evolution could not have been going on for longer than the Earth had existed. The estimate of the age of the Earth is now much higher. ;-)

The question can be rehashed as: How much time is enough? Can biological circuits evolve in sub-exponential time?

For more I would point to his paper:

Evolvability: Leslie Valiant (Journal of the ACM – PDF)

Towards the end of the talk he shows a Venn diagram of the type usually seen in complexity theory textbooks for classes P, NP, BQP etc., but with one major difference: these containments are proven facts, not conjectures:

Fact: Evolvability ⊆ SQ-Learnable ⊆ PAC-Learnable

*SQ or Statistical Query Learning is due to Michael Kearns (1993)

Coda: Valiant claims that the problem of evolution is no more mysterious than the problem of learning. The mechanism that underlies biological evolution is “evolvable target pursuit”, which in turn is the same as “learnable target pursuit”.

_________________________



I believe this needs to be tom-tommed.

 

[… “I have hunted butterflies in various climes and disguises: as a pretty boy in knickerbockers and sailor cap; as a lanky cosmopolitan expatriate in flannel bags and beret; as a fat hatless old man in shorts.” (Speak, Memory). Photo: Vladimir Nabokov as Lepidopterist. (From Nabokov Museum)]

Nabokov, Dostoevsky, Kafka and Camus would be my four picks from the bag of my favourite non-science/logic authors, each for an entirely different reason. I have always believed that Nabokov was unsurpassable when it came to intricate word play and the construction of word-filigrees. I relate his remarkable ability to swim through words and conjure up unimaginably beautiful lines to his being a synesthete, made even more remarkable by the fact that English was his third language. That aside, Nabokov had a parallel existence as an autodidactic lepidopterist with a serious interest in chess problems. While I knew that he had an interest in butterflies (and wrote poems on them), I somehow thought it was more of a dabble than a serious interest. It turns out it was much more than that: 60 years after he proposed it, his theory on the evolution of a particular genus of butterflies has proved to be remarkably true. I found this story very interesting; read more about it here (Nabokov Theory on Butterfly Evolution Is Vindicated, by Carl Zimmer).

 

Acmon blue butterfly, described by Nabokov in 1944. Photo from the New York Times

On Discovering a Butterfly (Nabokov, 1943).

I found it and I named it, being versed
in taxonomic Latin; thus became
…godfather to an insect and its first
describer — and I want no other fame.

Wide open on its pin (though fast asleep),
and safe from creeping relatives and rust,
in the secluded stronghold where we keep
type specimens it will transcend its dust.

Dark pictures, thrones, the stones that pilgrims kiss,
poems that take a thousand years to die
but ape the immortality of this
red label on a little butterfly.

_____________



I am late with my echo statement on this.

Over the past few days, the internet has been abuzz with news of Craig Venter and his team creating the first fully functional cell controlled by synthetic DNA, and with discussions of the ethical consequences of future work in this area.

That this has happened is not surprising at all. Dr Venter has been very open about his work and has been promoting it for some years now. For instance, a couple of years ago there was a wonderful TED talk in which Venter talked about his team being close to creating synthetic life. The latest news is of course not of synthetic life, but a step closer to that grand aim.

Another instance: two years ago there was a brainstorming session whose transcript was converted by EDGE into a book, available for free download too.

Dimitar Sasselov, Max Brockman, Seth Lloyd, George Church, J. Craig Venter, Freeman Dyson, Image Courtesy - EDGE

The BOOK can be downloaded from here.

Given such updates, it did not surprise me much when Venter made the announcement.

____________

Ethics: There have been frenzied debates, on the internet, on television and elsewhere, about what this might lead to. These discussions on ethics appear to me to be inevitable, and I find it most appropriate to quote the legendary Freeman Dyson on the matter.

“Two hundred years ago, William Blake engraved The Gates of Paradise, a little book of drawings and verses. One of the drawings, with the title “Aged Ignorance”, shows an old man wearing professional eyeglasses and holding a large pair of scissors. In front of him, a winged child running naked in the light from a rising sun. The old man sits with his back to the sun. With a self-satisfied smile he opens his scissors and clips the child’s wings. With the picture goes a little poem:

“In Time’s Ocean falled drown’d,
In aged ignorance profound,
Holy and cold, I clip’d the Wings
Of all Sublunary Things.”

This picture is an image of the human condition in the era that is now beginning. The rising sun is biological science, throwing light of ever increasing intensity onto the processes by which we live and feel and think. The winged child is human life, becoming for the first time aware of itself and its potentialities in the light of science. The old man is our existing human society, shaped by ages of past ignorance. Our laws, our loyalties, our fears and hatreds, our economic and social injustices, all grew slowly and are deeply rooted in the past. Inevitably the advance of biological knowledge will bring clashes between old institutions and new desires for human improvement. Old institutions will clip the wings of human desire. Up to a point, caution is justified and social constraints are necessary. The new technologies will be dangerous as well as liberating. But in the long run, social constraints must bend to new realities. Humanity cannot live forever with clipped wings. The vision of self-improvement which William Blake and Samuel Gompers in their different ways proclaimed will not vanish from the Earth.”

(The above is an excerpt from a lecture given by Freeman Dyson at the Hebrew University of Jerusalem in 1995. The lecture was published by the New York Review of Books in 1997 and later as a chapter in The Scientist as Rebel.)

Artificial Life Beyond the Wet Medium:

“Life is a process which can be abstracted away from any particular medium” – John von Neumann

Wet artificial life is basically what synthetic life is (in synthetic life you don’t really abstract the life process into another medium; you digitize it and recreate it as per your requirement).

I do believe that abstracting and digitizing life from a “wet chemical medium” to a computer is not very far off either, i.e. software that would not only imitate “life” but also synthesize it, and that, coupled with something like Koza’s Genetic Programming scheme embedded in it, could develop something possessing some intelligence beyond producing more useful programs.

Coded Messages:

This is the fun part of the news about Venter and his team’s groundbreaking work: the synthetic DNA of the bacterium has a few messages coded into it.

1. “To live, to err, to fall, to triumph, to create life out of life.” – from James Joyce’s A Portrait of the Artist as a Young Man.

James Joyce is one of my favourite writers*, so I was glad that this was encoded too. But I find it funny that what this quote says could also be the undoing of synthetic life, or rather its most difficult problem. The biggest enemy of synthetic life is evolution (creating life out of life :)): evolution would ensure that control of the synthetic bacteria is lost soon enough. I believe that countering this will be the single biggest challenge in synthetic biology.

*When I tried reading Ulysses, I kept giving up, but I had this compulsive need to finish it anyway. I had to join an Orkut community called “Who is afraid of James Joyce”, and after some motivation could read it! ;-)

2. What I can not build, I can not understand – Richard P. Feynman

This is what Dr Venter announced; isn’t “What I cannot create, I do not understand” the correct version?

Feynman's Blackboard at the time of his death: Copyright - Caltech

3. “See things not as they are, but as they might be” – J. Robert Oppenheimer from American Prometheus

____________

Recommendations:

1. What is Life? – Erwin Schrödinger (PDF)

2. Life – What A Concept! – EDGE (PDF)

3. A Life Decoded: My Genome, My Life – C. J. Venter (Google Books)

____________

Onionesque Reality Home >>

Read Full Post »

I am in the process of winding up a basic course on Bio-Informatics that I have been teaching; it is offered as an elective to final-year undergraduate Information Technology students. I preferred to take this course as visiting faculty on weekends, as managing time during the week is hard (though I did take some classes on weekdays).

______

[Gene Clustering: (a) shows clusters, (b) uses hierarchical clustering, (c) uses k-means, (d) SOM finds clusters which are arranged in grids. Source: Nature Biotechnology 23, 1499–1501 (2005), Patrick D’haeseleer]
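Since the figure above contrasts hierarchical clustering, k-means and SOMs on gene-expression data, here is a minimal sketch of the k-means step, assuming toy expression profiles. The data, k and the iteration count are all illustrative assumptions, not values from the figure’s source:

```python
# Minimal k-means on toy "expression profiles": each gene is a vector of
# expression values across conditions.
import random

def kmeans(points, k, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each gene goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Update step: each center becomes the mean of its cluster.
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = tuple(sum(col) / len(cl) for col in zip(*cl))
    return centers, clusters

genes = [(1.0, 0.9, 1.1), (0.9, 1.0, 1.2),   # one co-expressed group
         (5.0, 5.2, 4.8), (5.1, 4.9, 5.0)]   # another
centers, clusters = kmeans(genes, k=2)
print(centers)
```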

Why Bio-Informatics?

Out of the courses offered in the Fall, the one I would have most preferred to take was AI. There is no course on Machine Learning or Pattern Recognition at the UG level here, and the course on AI comes closest, as it gives sufficient weight to Neural Nets and Bayesian Learning.

Since AI was not available, the subject that came nearest to my choice was Bio-Informatics, as about 60 percent of its syllabus was Machine Learning, Data Mining and Pattern Recognition. It being a basic course gave me the liberty to cover these parts in much more detail than the rest. And that’s exactly why taking up Bio-Informatics, even though it’s not directly my area, was not a bad bargain!

______

The Joys of Teaching:

This is the first time that I have formally taken a complete course; I have taken workshops and given talks quite a few times before, but never a complete course.

I have always enjoyed teaching. When I say I enjoy teaching, I don’t necessarily mean something academic; I like discussing ideas in general.

If I try to put down why I enjoy teaching, a few reasons come to mind:

  • There is an obvious inherent joy in teaching that few activities hold for me. When I say teaching here, as before, I don’t just mean formal teaching, but the more general sense of the term.
  • It’s said that there is no better way to learn than to teach. That was actually the single largest motivation that prompted me to take the offer.
  • Teaching gives me a high! When I get to discuss what I like (and teach), I forget the things that might be pressing me at other times of the day. I tend to become a space-cadet when teaching. It’s such a wonderful experience!
  • One more reason I think I like teaching is this: I read (or at least am interested in) a wide range of things, and I have noticed that the best way all that reading gets connected, often in the most unexpected ways, is in discussion. You don’t often find people interested in involved discussions, and being an introvert means the problem is further compounded. Teaching gives me a platform to engage in such discussions. Some of the best ideas I have had, borrowing from a number of almost unrelated areas, came while discussing/teaching. And this course gave me a number of ideas that I would act on if I get the chance and the resources.
  • Teaching also shows you the limits of your own reading and can inspire you to plug the deficiencies in your knowledge.
  • Beyond that, I take teaching or explaining things as a challenge. I enjoy finding out that I can explain pion exchange to friends who have not seen a science book since grade 10. Teaching is a challenge well worth taking, for a number of reasons!

From this specific course the most rewarding moment was when a couple of groups approached me after the conclusion of classes to help them a little with their projects. Since their projects are of moderate difficulty and from pattern recognition, I took that as a compliment for sure! Though I cannot say I will “help” them (I don’t like using that word; it sounds pretentious), I would definitely like to work with them on their projects and hopefully learn something new about the area.

______

Course:

I wouldn’t be putting up my notes for the course, but the topics I covered included:

1. Introduction to Bio-Informatics, Historical Overview, Applications, Major Databases, Data Management, Analysis and Molecular Biology.

2. Sequence visualization, structure visualization, user interfaces, animation versus simulation, general purpose technologies, statistical concepts, microarrays, imperfect data, quantitative randomness, data analysis, tool selection, statistics of alignment, clustering and classification, regression analysis.

3. Data Mining Methods & Technology overview, infrastructure, pattern recognition & discovery, machine learning methods, text mining & tools, dot matrix analysis, substitution metrics, dynamic programming, word methods, Bayesian methods, multiple sequence alignment, tools for pattern matching.

4. Introduction, working with FASTA, working with BLAST, filtering and gapped BLAST, FASTA & BLAST algorithms & comparison.

Like I said earlier, my focus was on dynamic programming, clustering, regression (linear, locally weighted), logistic regression, support vector machines, Neural Nets and an overview of Bayesian Learning; I then introduced all the other aspects as applications and covered the necessary theory as it came up.
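Of those, dynamic programming is worth a small sketch, since it underlies sequence alignment (FASTA and BLAST add fast heuristics on top of the same idea). Below is a minimal Needleman-Wunsch global-alignment scorer; the scoring values are illustrative, not any standard substitution matrix:

```python
# Global sequence alignment score by dynamic programming (Needleman-Wunsch).
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap               # align a[:i] against gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap               # align b[:j] against gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            score[i][j] = max(diag,          # substitute/match
                              score[i-1][j] + gap,   # gap in b
                              score[i][j-1] + gap)   # gap in a
    return score[n][m]

print(needleman_wunsch("GATTACA", "GCATGCU"))
```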

______

Resources:

All my notes for the course were handwritten and not in \LaTeX, so it would be impossible to put them up now (they were basically made from a number of books and MIT-OCW).

However, I will update this space soon with links to all the resources I would recommend.

______

I am looking forward to taking a course on Digital Image Processing, with labs, next semester, which begins in December (again as a visiting instructor)! Since Image Processing is closer to the area I am most deeply interested in (Applied Machine Learning – Computer Vision), I am already very excited about the possibility!

______

Onionesque Reality Home >>

Read Full Post »

I have been following Dr Adrian Dyer’s work for about a year. I discovered him and his work while searching the Monash University website for research on vision, ML and related fields, when a friend of mine was thinking of applying for a PhD there. Dr Dyer is a vision scientist and a top expert on bees. His present research investigates aspects of insect vision and how insects make decisions in complex, colored natural environments.

His current findings (published just about a month or so back) could offer important cues for improving computer-based facial recognition systems.


[Thermographic image of a bumblebee on a flower, showing temperature differences. Image Source]

_____

When I read about Dr Dyer’s most recent work, I was instantly reminded of a rather similar experiment on pigeons done almost a decade and a half back by Watanabe et al [1].

Digression: Van Gogh, Chagall and Pigeons

Allow me to take a brief digression and look at that classic experiment before returning to bees. The experiment by Watanabe et al is my favorite when I have to initiate a general talk on what pattern recognition and generalization are.

In the experiment, pigeons were placed in a Skinner box and trained to recognize paintings by two different artists (e.g. Van Gogh and Chagall). The training was done by rewarding a peck when a painting by the target artist was presented.

[Image: a pigeon in a Skinner box]

_____

The pigeons were trained on a marked set of paintings by Van Gogh and Chagall. When tested on the SAME set, they gave a discriminative ability of 95%. Then they were presented with paintings by the same artists that they had not previously seen. Their ability to discriminate was still about 85%, which is a pretty high recognition rate, and interestingly not far off human performance.

What this means is that pigeons do not memorize pictures; they can extract patterns (the “style” of painting, etc.) and generalize from already seen data to make predictions on data never seen before. Isn’t this really what Artificial Neural Networks and Support Vector Machines do? And a lot of research is going on to improve the generalization of such systems.
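To make the analogy concrete, here is a sketch of the same train/test protocol with a support vector machine, assuming scikit-learn is available; handwritten digits stand in for the paintings:

```python
# Train on "seen paintings", test on "novel paintings": the gap between
# the two scores is exactly the generalization question above.
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

X, y = datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)
clf = svm.SVC().fit(X_train, y_train)
print("seen data accuracy:  ", clf.score(X_train, y_train))
print("novel data accuracy: ", clf.score(X_test, y_test))
```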

_____

Coming Back to Bees:

Dr Adrian Dyer’s group has been working on visual information processing in miniature brains, brains much smaller than the primate brain. Honeybees provide a good model for such work, one reason being that a lot is known about their behavior.

Their work has mostly been on color information processing in bees and how it could lend cues for improving vision in robotic systems. Investigations have shown that the very small bee brain can be taught to recognize extremely difficult, complex objects.

_____

Color Vision in Bees [2] :

Research by Dr Dyer’s group on color vision in bees has provided key insights into the relationship between the perceptive abilities of bees and the flowers they frequent. For example, it was known that flowers had evolved particular colors to suit the visual system of bees, but it was not known why certain flower colors are extremely rare in nature. The color discrimination abilities of the bees were tested in a virtual meadow. There was high variability in the strategies used by bees to solve problems; however, their discriminative abilities were somewhat comparable to human sensory abilities. This research has shown not only how plants evolve flower colors to attract pollinators, but also what color vision systems in nature are like and what their limitations are.

_____

Recognition of Rotated Objects (Faces) [3] :

The ability to recognize 3-dimensional objects in natural environments is a challenging problem for visual systems, especially under rotation of the object, variation in lighting conditions, occlusion, etc. Rotation can sometimes make an object look more dissimilar to its own rotated version than to a different, non-rotated object. Human and other primate brains are known to recognize novel orientations of rotated objects by interpolating between stored views; primate brains perform better when presented with novel views that can be interpolated from stored views than with novel views that require extrapolation beyond the stored views. Some animals, for example pigeons, perform equally well on interpolated and extrapolated views.

To test how miniature brains such as the bee’s deal with the problem posed by rotation of objects, Dr Dyer’s group presented bees with a face recognition task. The bees were trained on different views (0, 30 and 60 degrees) of two similar faces, S1 and S2; these are enough to get an insight into how the brain handles a view with which it has no prior experience. The bees were trained with face images from a standard test used for studying vision in human infants.

[The training face images used in the experiment]

[Image Source – 3]

Group 1 of bees was trained on the 0 degree orientation and then given a non-rewarded test on the same view and on a novel 30 degree view.

Group 2 was trained on the 60 degree orientation and then tested (non-rewarded) on the same view and on a novel 30 degree view.

Group 3 was trained on both the 0 degree and 60 degree orientations and then tested on the same views and on the novel 30 degree interpolation stimulus.

Group 4 was trained on both the 0 degree and 30 degree orientations and then tested on the same views and also on the novel 60 degree extrapolation view.

Bees in all four groups were able to recognize the trained target stimuli well above chance. But when presented with novel views, Groups 1 and 2 failed, while Group 3 succeeded, by interpolating between the 0 degree and 60 degree views. Group 4 also could not recognize the novel view presented to it, indicating that bees could not extrapolate from the 0 and 30 degree training to recognize faces oriented at 60 degrees.

The experiment suggests that a miniature brain can learn to recognize complex visual stimuli through experience, by a method of image interpolation and averaging; performance is poor, however, when the stimulus requires extrapolation.
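As a toy illustration of the interpolation-and-averaging idea (a caricature of my own, not a model of the bee brain), treat each stored view as a vector, average the stored views, and measure how close novel views land. The synthetic views and all the numbers are assumptions:

```python
# Recognition by view interpolation: stored views at 0 and 60 degrees,
# a novel view at 30 degrees (interpolation) vs 90 degrees (extrapolation).
import numpy as np

rng = np.random.default_rng(0)
base = rng.standard_normal(64)     # the "face"
drift = rng.standard_normal(64)    # how its appearance changes with angle

def view(angle_deg):
    # Synthetic view of the face at a given angle, plus a little noise.
    return base + (angle_deg / 60.0) * drift + 0.05 * rng.standard_normal(64)

stored = [view(0), view(60)]                 # training views
memory = sum(stored) / len(stored)           # averaged/interpolated store

# The interpolated novel view sits close to the averaged memory...
print("30-degree view:", np.linalg.norm(memory - view(30)))
# ...while the extrapolated one lands far from anything stored.
print("90-degree view:", np.linalg.norm(memory - view(90)))
```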

_____

Bees performed poorly on images that were inverted. But in this regard they are in good company: even humans do not perform well when images are inverted. As an illustration, consider these two images:

[Two face images, inverted]

How similar do these two inverted images look at first glance? Very similar? Let’s now have a look at the upright images. The one on the left below corresponds to the image on the left above.

[The same two images, upright]

[Image Source]

The upright images look very different, even though the inverted images looked very similar.

I would recommend rotating the image on the right to its upright position and back; please follow this link to do so. This is a good illustration of the fact that the human visual system does not perform as well with inverted images.

_____

The findings show that despite the constrained neural resources of the bee brain (about 0.01 percent of a human’s), bees can perform remarkably well at complex visual recognition tasks. Dr Dyer says this could lead to improved models in artificial systems, as this test is evidence that face recognition does not require an extremely advanced nervous system.

The conclusion is that the bee brain, which is small (about 850,000 neurons), could be simulated with relative ease for performing rather complex pattern recognition tasks, contrary to what was previously thought necessary for complex pattern analysis and recognition.

On a lighter note, Dr Dyer also says that his findings don’t justify the adage that you should not kill a bee because its nest mates will remember you and take revenge. ;-)

_____

References and Recommended Papers:

[1] Van Gogh, Chagall and Pigeons: Picture Discrimination in Pigeons and Humans, in Animal Cognition, Springer-Verlag – Shigeru Watanabe.

[2] The Evolution of Flower Signals to Attract Pollinators – Adrian Dyer.

[3] Insect Brains use Image Interpolation Mechanisms to Recognise Rotated Objects, in PLoS ONE – Adrian Dyer, Quoc Vuong.

[4] Honeybees can recognize complex images of Natural scenes for use as Potential Landmarks – Dyer, Rosa, Reser.

_____

Onionesque Reality Home >>

Read Full Post »

About two months back I came across a series of Reith Lectures given by Professor Vilayanur Ramachandran. Dr Ramachandran holds an MD from Stanley Medical College and a PhD from Trinity College, Cambridge, and is presently the director of the Center for Brain and Cognition at the University of California, San Diego, and an adjunct professor of biology at the Salk Institute. He is known for his work on behavioral neurology, which promises to greatly enhance our understanding of the human brain, and which could, in my opinion, be the key to making “truly intelligent” machines.


[Dr VS Ramachandran: Image Source- TED]

I listened to these lectures two or three times, really enjoyed them, and was intrigued by the cases he presents. Though these are old lectures (they were given in 2003), they are new to me and I think they are worth sharing anyway.

For those who are not aware, the Reith Lectures were started by BBC radio in 1948. Each year a person of high distinction gives the lectures; the first were given by the philosopher and mathematician Bertrand Russell. They are named in honor of the first Director-General of the BBC, Lord Reith. Like most other BBC presentations on science, politics and philosophy, they are fantastic. Dr Ramachandran was the first person from the medical profession to deliver them.

The 2003 series, named The Emerging Mind, has five lectures, each roughly 28-30 minutes long. Each is trademark Ramachandran: funny anecdotes, witty arguments, very interesting clinical cases, the best pronunciation of “billions” since Carl Sagan, and let me not mention the way he rolls the RRRRRRRs while talking. I don’t intend to describe below what the lectures are about; I think they should be allowed to speak for themselves.

Lecture 1: Phantoms in the Brain

Listen to Lecture 1 | View Lecture Text

Lecture 2: Synapses and the Self

Listen to Lecture 2 | View Lecture Text

Lecture 3: The Artful Brain

Listen to Lecture 3 | View Lecture Text

Lecture 4: Purple Numbers and Sharp Cheese

Listen to Lecture 4 | View Lecture Text

Lecture 5: Neuroscience the new Philosophy

Listen to Lecture 5 | View Lecture Text

[Images above courtesy of the BBC]

Note: Real Player required to play the above.

As a bonus to the above, I would also advise those who have not seen it to have a look at the following TED talk.

In a wide-ranging talk, Vilayanur Ramachandran explores how brain damage can reveal the connection between the internal structures of the brain and the corresponding functions of the mind. He talks about phantom limb pain, synesthesia (when people hear color or smell sounds), and the Capgras delusion, when brain-damaged people believe their closest friends and family have been replaced with imposters.

Here again he talks about curious disorders. The Capgras delusion, which he discusses in the video above, is only one among the many he covers in the Reith Lectures. He also talks here about the origin of language and about synesthesia.

Now look at the picture below and answer the following question: Which of the two figures is Kiki and which one is Bouba?

If you thought that the one with the jagged shape was Kiki and the one with the rounded shape was Bouba, then you belong to the majority. The exceptions need not worry.

These experiments were first conducted by the German gestalt psychologist Wolfgang Köhler, and were repeated, with the names “Kiki” and “Bouba” given to the shapes, by V.S. Ramachandran and Edward Hubbard. In their experiments they found a very strong inclination in their subjects to name the jagged shape Kiki and the rounded one Bouba; this happened with about 95-98 percent of the subjects. The experiments were repeated with Tamil speakers and then with children of about 3 years of age (who could not write), with similar results. The only exceptions were people with autistic disorders, where the percentage reduced to 60.

Dr Ramachandran and Dr Hubbard went on to suggest that this could have implications for our understanding of how language evolved, as it suggests that the naming of objects is not a random process, as a number of views hold, but depends on the appearance of the object under consideration. The strong “K” in Kiki has a direct correlation with the jagged shape of that object, suggesting a non-arbitrary mapping of objects to the sounds associated with them.

In the above talk, and also in the lectures, he talks about synesthesia, a condition wherein the subject associates a color with each black-and-white number or letter seen.

His method of studying rare disorders to understand what in the brain does what is very interesting, and is giving much-needed insights into the organ that drives innovation and, well, almost everything.

I highly recommend all the above lectures and the video above.

Onionesque Reality Home >>

Read Full Post »

I started writing this post on 18 June with the singular aim of posting it by 22 June, the objective being to celebrate the life and ideas of Tommy Gold (May 22, 1920 – June 22, 2004) on his fourth death anniversary. But I then did not have much access to the Internet, for reasons I had posted about earlier, and so sadly I missed that date. After that I did not edit and post it, as I thought there would be little point. Now I think it is better to post it than to delete it altogether; a tribute to Thomas Gold is still the aim, though I regret I could not post in time.

[Image Source]

Quoting Thomas Gold (Source):

New ideas in science are not always right just because they are new. Nor are the old ideas always wrong just because they are old. A critical attitude is clearly required of every scientist. But what is required is to be equally critical to the old ideas as to the new. Whenever the established ideas are accepted uncritically, but conflicting new evidence is brushed aside and not reported because it does not fit, then that particular science is in deep trouble – and it has happened quite often in the historical past. If we look over the history of science, there are very long periods when the uncritical acceptance of the established ideas was a real hindrance to the pursuit of the new. Our period is not going to be all that different in that respect, I regret to say.

This paragraph reminds me of a post on Gaping Void, a blog I discovered just two days back via the fantastic Reasonable Deviations. The post, titled Good Ideas Have Lonely Childhoods, is highly recommended reading, as a vast majority of good ideas are heretical, and this post of mine is about a heretic. In fact, that Gaping Void post is what prompted me to publish this forgotten draft!

Thomas Gold was a true renaissance man, a brilliant polymath and a controversial figure whom Freeman Dyson has described as a modern heretic. Gold was born in Austria and educated in Switzerland and the UK. He initially worked with Hermann Bondi and Fred Hoyle, and later accepted an appointment at the prestigious Cornell University, where he remained till his death.

Gold was the archetypal rebel scientist, with a penchant for controversy and for working against widely and strongly held theories. He worked across a large number of fields: cosmology, biophysics, astrophysics, geophysics, space engineering, etc. Throughout his career Gold never cared about being wrong or about the opposition, and he had a knack for turning out to be right. He was not afraid to be wrong either; in fact, he has been very famously wrong twice, and he took both occasions in good humor. His ideas have always been very interesting, and I hope to chronicle some of the major ones here.

Coming back, as I said he has been famously wrong two times:

1. The first was the steady state theory. Gold, along with Fred Hoyle and Hermann Bondi, developed and published the steady state theory of the universe in 1948. The three thought it impossible that all of matter could be created out of an initial singularity. The theory proposed that new matter is created continuously, which accounts for the constant density of the expanding universe. Though this seems to violate the first law of thermodynamics, the steady state theory had a number of supporters in the 50s and 60s; but the discovery of the cosmic background radiation, which is essentially a remnant of the big bang, was the first major blow to it, and over time its acceptance declined to only a very few cosmologists, like Jayant V. Narlikar, who have more recently proposed alternatives and modifications to the original idea, such as the quasi steady state. Whatever said and done, the competition between the big bang and the steady state spurred a lot of research, which has ultimately helped us understand the cosmos better, as good competition always does.

2. His second major incorrect idea was proposed in 1955, when he said that the moon’s surface was covered with a fine, electrostatically supported rock powder, and later that astronauts would sink into it as soon as they landed on the moon. His theory influenced the design of the American Surveyor lunar landing probes to a very large extent, but the precautions were excessive and most of the fears unfounded. When the Apollo 11 crew brought back soil samples from the moon, the soil was indeed powdery, though nowhere near to the extent Gold had proposed. Still, a lot of astronomers credit much of the development in planetology in subsequent years to Gold’s initial work and ideas on the lunar regolith.

[The famous photo of a footprint on the lunar surface. The lunar soil was powdery, as predicted by Gold, but nowhere near the extent he had thought. Image Source: Wikipedia Commons]

On both occasions Gold took “defeat” in good humor; the trademark of a good scientist is that he is never afraid to be wrong. He once remarked:

Science is no fun, if you are never wrong!

In choosing a hypothesis there is no virtue in timidity and no shame in sometimes being wrong.

The second quote is not supposed to be humorous by the way.

On most occasions, however, Thomas Gold had the knack of turning out to be right in spite of facing intense criticism initially. Some of his heretical ideas that turned out right were:

1. Pitch Discriminative Ability of the Ear: One of the first of Tommy Gold’s ideas to be received with much hostility, and summarily rejected by the experts of the time, was his theory of, and experiments on, hearing and pitch discrimination. In 1946, immediately after the great war, Gold got interested in the ability of the human ear to discriminate the pitch of musical sounds. It was a question that was perplexing the auditory physiologists of the time, and Gold, fresh from working with the Royal Navy on radar and communications, thought of the physiology of hearing in those terms. The human ear can tell the difference when a pure tone changes by as little as one percent. Gold thought that the ear contained a set of finely tuned resonators, whereas the prevailing view of the time was that the internal structure of the ear was too weak and flabby to resonate, and that all the interpretation of sounds and tones happened in the brain, with the information being communicated by neural signals.

Gold designed a very simple and elegant experiment to prove the experts, the professional auditory physiologists, wrong. The experiment has been described by Freeman Dyson in his book The Scientist as Rebel, as he himself was a part of it. Prof Dyson writes:

He (Gold) fed into the headphones a signal consisting of short pulses of a pure tone, separated by intervals of silence. The silent intervals were at least ten times as long as the period of the pure tone. The pulses were all of the same shape, but they had phases that could be reversed independently….Sometimes Gold gave all the pulses the same phase and sometimes he alternated the phases so that the even pulses had one phase and the odd pulses had the opposite phase. All I had to do was to sit with the headphones on my ears and listen while Gold put in the signals with either constant or alternating phases. I had to tell him from the sound whether the phase was constant or alternating. When the silent interval between pulses was ten times the period of the pure tone, it was easy to tell the difference. I heard a noise like a mosquito, a hum and a buzz sounding together, and the quality of the hum changed noticeably when the phases were changed from constant to alternating. We repeated the trials with longer silent intervals. I could still tell the difference when the silent interval was as long as thirty periods.

This elegant experiment showed that the human ear can remember the phase of a signal, after it has stopped, for thirty times the period of the signal, and it proved that pitch discrimination is done not in the brain but in the ear. To be able to remember the phase, the ear must have finely tuned resonators that continue to vibrate during the period of silence.

Now armed with experimental evidence for his theory that pitch discrimination was done in the ear, Gold also had a theory of how there could be very finely tuned resonators made of the weak and flabby material in the ear. He proposed that the ear is an active, not a passive, receiver: one in which positive feedback, not just passive detection, is involved. The ear, he said, has an electrical feedback system; the mechanical resonators are coupled to electrically powered sensors, so that the overall system works like an active tuned amplifier, with the positive feedback counteracting the dissipation in the flabby internal structure of the ear.
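As a toy numerical sketch of that idea (my own illustration, not Gold’s model), take a heavily damped oscillator and let positive feedback cancel most of the damping; the ring-down time, and hence the sharpness of tuning, grows enormously. All parameters here are illustrative:

```python
# A damped oscillator x'' = -omega^2 x - (damping - feedback) x':
# feedback partially cancels damping, as in an active tuned amplifier.
import numpy as np

def ring_time(damping, feedback, omega=2 * np.pi * 1000.0, dt=1e-6):
    # Kick the oscillator and time how long the envelope stays above
    # 1% of its starting amplitude (semi-implicit Euler integration).
    x, v, t = 1.0, 0.0, 0.0
    while abs(x) > 0.01 or abs(v / omega) > 0.01:
        a = -omega**2 * x - (damping - feedback) * v
        v += a * dt
        x += v * dt
        t += dt
        if t > 1.0:          # safety cap
            break
    return t

heavy_damping = 2 * np.pi * 1000.0   # "weak and flabby" passive tissue
print("passive ring time:", ring_time(heavy_damping, 0.0))
print("active  ring time:", ring_time(heavy_damping, 0.98 * heavy_damping))
```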

Gold’s findings and ideas were rejected by the experts of the field, who said Gold was an ignorant outsider with absolutely no knowledge of or training in physiology. Gold, however, always maintained he was right. Thirty years later, auditory physiologists armed with more sophisticated tools discovered that Gold was indeed correct: the electrical sensors and the feedback system in the ear were identified.

Gold’s two papers on hearing published in 1948 remain highly cited to this day.

2. Pulsars: One of his ideas that was accepted rather quickly was his proposal of what a pulsar is: after pulsars were discovered by radio astronomers, Gold proposed that they were rotating neutron stars.

[A schematic of a Pulsar. Image Source: Wikipedia Commons]

After some initial disapproval, this idea was accepted almost immediately by the “experts”. Gold himself wrote on the matter in an article titled The Inertia of Scientific Thought:

Shortly after the discovery of pulsars I wished to present an interpretation of what pulsars were, at this first pulsar conference: namely that they were rotating neutron stars. The chief organiser of this conference said to me, “Tommy, if I allow for that crazy an interpretation, there is no limit to what I would have to allow”. I was not allowed five minutes floor time, although I in fact spoke from the floor. A few months later, this same organiser started a paper with the sentence, “It is now generally considered that pulsars are rotating neutron stars”.

3. The Arrow of Time: In the 60s Gold wrote extensively on the arrow of time, and held the view that the universe will recollapse someday and that the arrow of time will then reverse. His views remain controversial to this day, and a vast majority of cosmologists don’t take them seriously. It remains to be seen whether Gold’s hypothesis will come to be respected.

4. Polar Wandering: In the 1950s, while at the Royal Observatory, Gold became interested in the instability of the Earth’s axis of rotation, the wandering pole. He wrote a number of papers on plasmas and magnetic fields in the solar system, and also coined the term “the Earth’s magnetosphere”. In 1955 he published yet another revolutionary paper, “Instability of the Earth’s Axis of Rotation”. Gold took the view that large-scale polar wandering could occur over relatively short geological time spans; that is, the Earth’s axis of rotation could migrate by 90 degrees in under a million years. In such a case, points at the equator would come to the poles and points at the poles would come to the equator. Gold argued that this 90 degree migration would be triggered by movements of mass that cause the old axis of rotation to become unstable; a large accumulation of ice at the poles, for example, might be one reason for such a flip. His paper was largely ignored for over 40-45 years, mostly because research at the time was focused on plate tectonics and continental drift.

In 1997 the Caltech professor Joseph Kirschvink, an expert in these areas, published a paper suggesting that such a 90 degree flip did indeed happen at least once in the past, in the early Cambrian era. This holds much significance given that this large-scale migration of the poles coincides with the so-called “Cambrian Explosion”. Gold’s work was finally confirmed after being ignored for decades.

5. Abiogenic Origin of Petroleum: When I first read about the theory of the abiogenic origin of petroleum promoted by Tommy Gold and many Soviet and Ukrainian geologists, I was immediately reminded of my old organic chemistry texts, which spoke of the abiogenic origin theory given by Mendeleev almost 150 years ago. This was called Mendeleev’s carbide theory, and it died out after the biological theory of petroleum origin became widely accepted.

Speaking as a layman who has little knowledge of geology and petroleum, I would say any theory of petroleum origin must broadly explain the following:

1. Its association with Brine.

2. Presence of N and S compounds.

3. Presence of biomarkers, chlorophyll and haemin in it.

4. Its optically active nature.

According to Mendeleev’s Carbide theory:

1. The molten metals in the Earth’s interior combined with carbon from coal deposits to form the corresponding carbides.

  • Ca + 2C ---> Ca C_2
  • Mg + 2C ---> Mg C_2
  • 4Al + 3C ---> Al_4 C_3

2. The carbides reacted with steam or water under high temperature and pressure to form a mixture of saturated and unsaturated hydrocarbons.

  • Ca C_2 + 2H_2 O ---> Ca(OH)_2 + C_2 H_2
  • Al_4 C_3 + 12H_2 O ---> 4Al(OH)_3 + 3C H_4

3. The unsaturated hydrocarbons underwent a series of reactions such as hydrogenation, isomerisation, polymerisation and alkylation to form a number of hydrocarbons.

  • C_2 H_2 ---> C_2 H_4 ---> C_2 H_6
  • 3C_2 H_2 ---> C_6 H_6

etc.

This theory got support from the work of Moissan, and of Sabatier and Senderens: Moissan obtained a petroleum-like liquid by the hydrogenation of uranium carbide, while Sabatier and Senderens obtained a petroleum-type substance by the hydrogenation of acetylene.

However the theory was in time replaced by the theory of biological origin as it failed to account for:

1. The presence of Nitrogen and Sulphur compounds.

2. Presence of Haemin and Chlorophyll.

3. Optically active nature.

After almost a hundred years, the abiogenic theory was resurrected by the great Russian geologist Nikolai Alexandrovich Kudryavtsev in 1951, and it was worked on extensively by a number of Russians over the following two decades.

In the West, Thomas Gold was the only major proponent of it, and this is his most controversial theory, not only because it was opposed by powerful oil industry lobbyists but also because Gold faced much flak over plagiarism, a charge he refused to accept. In his later works he cited the work of the Russian scientists in the field, maintaining that he had simply been unaware of the Soviet geologists’ work and cited it once he became aware of it. Gold proposed that natural gas and oil come from reservoirs deep within the Earth and are simply relics of the formation of the Earth, and that the biological molecules found in them do not show a biological origin but rather contamination by living creatures. He remained critical of the proponents of the biological origin theory, since it could not explain why there are hydrocarbon reserves on other planets on which there has been no life. The theory remains controversial, and Gold did not live to defend it. However, an elegant experiment provides some evidence that Gold could indeed be right once again.

Dyson wrote the following in an EDGE essay in this regard:

Just a few weeks before he died, some chemists at the Carnegie Institution in Washington did a beautiful experiment in a diamond anvil cell, [Scott et al., 2004]. They mixed together tiny quantities of three things that we know exist in the mantle of the earth, and observed them at the pressure and temperature appropriate to the mantle about two hundred kilometers down. The three things were calcium carbonate which is sedimentary rock, iron oxide which is a component of igneous rock, and water. These three things are certainly present when a slab of subducted ocean floor descends from a deep ocean trench into the mantle. The experiment showed that they react quickly to produce lots of methane, which is natural gas. Knowing the result of the experiment, we can be sure that big quantities of natural gas exist in the mantle two hundred kilometers down. We do not know how much of this natural gas pushes its way up through cracks and channels in the overlying rock to form the shallow reservoirs of natural gas that we are now burning. If the gas moves up rapidly enough, it will arrive intact in the cooler regions where the reservoirs are found. If it moves too slowly through the hot region, the methane may be reconverted to carbonate rock and water. The Carnegie Institute experiment shows that there is at least a possibility that Tommy Gold was right and the natural gas reservoirs are fed from deep below. The chemists sent an E-mail to Tommy Gold to tell him their result, and got back a message that he had died three days earlier.
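For concreteness, one balanced reaction consistent with the three ingredients Dyson names (the stoichiometry below is my own back-of-the-envelope reconstruction, written in the same notation as the carbide equations above, and not necessarily the exact reaction reported by Scott et al.) would be:

  • CaCO_3 + 12FeO + 2H_2 O ---> 4Fe_3 O_4 + CaO + CH_4

Here the carbonate supplies the carbon, the water supplies the hydrogen, and the iron oxide soaks up the oxygen by being oxidised to magnetite, leaving methane.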

6. The Deep Hot Biosphere: I am yet to read this book, though I have been thinking of reading it for almost a year now.

[The Deep Hot Biosphere, Image Source : Amazon]

In this controversial but famous theory, Gold proposes that the entire crust of the Earth, up to a depth of a few miles, is populated by living creatures. The biosphere that we see is only a very small part of it; the most ancient part is much larger and much warmer. In 1992 Gold referred to ocean vents that pump bacteria up from the depths of the Earth in support of his views. A number of such hydrothermal vents have since been discovered. There is increasing evidence that yet another of his controversial theories might just be right; even if it is not, the evidence collected will help us understand our planet much better.

[A Black Smoker Hydrothermal Vent]

Finally Quoting Prof Freeman Dyson on him again:

Gold’s theories are always original, always important, usually controversial, and usually right.

References and Recommended Reads:

1. The Scientist as Rebel : Chapter 3 – Freeman Dyson (Amazon)

2. The Inertia of Scientific Thought – Thomas Gold

3. The Deep Hot Biosphere – Thomas Gold

4. Heretical Thoughts about Science and Society – Freeman Dyson

Onionesque Reality Home >>

Read Full Post »

Older Posts »