A mildly personal post.
The title does not imply that the lines quoted below correspond to the exact origin of Kolmogorov Complexity, though they are related and give away the essence.
Information theory must precede probability theory and not be based on it. By the very essence of this discipline, the foundations of information theory have a finite combinatorial character.
With my background in Electrical Engineering I had the opportunity to take courses in Information Theory and Coding which made the idea of Shannon’s Information Theory quite familiar. But there was a time when I had enough background to started noticing conversations that were perhaps relegated to the background before. Simply because I didn’t know enough to make any sense of them and hence these conversations were more or less noise to me. But these happened to be on Kolmogorov Complexity. I hadn’t sat down and studied it. But had been reading articles here and there that mentioned it even with ideas such as The Godel Incompleteness theorems and the Halting Problem. It created the impression that this area must be fundamental but not clearly why.
And then I came across the above rather cryptic lines by Kolmogorov. Used to the idea of entropy (defined in terms of probability) as information, they made my brain hurt. I spent a couple of days thinking about them and suddenly I realized WHY it was so fundamental. And things started making more sense. Ofcourse I didn’t know anything about it as such, but the two day thinking session convinced me enough, that in a sense it was as fundamental as calculus for me given the things I was interested in (along with Shannon‘s and Fisher’s ideas). It also convinced me enough to want to know more about it no matter what projects I was involved in and immediately bought a book that I have been trying my way through as an aside to what I have been working on (linked below).
I find such insightful one liners that happen to cause almost a phase transition or a complete change in the way you look at some thing (information theory in this case) quite remarkable, making the new view very beautiful. Ofcourse there is a “right” time for them to occur but this was certainly one of them. The lines below had an auxiliary effect too:
The applications of probability theory can be put on a uniform basis. It is always a matter of consequences of hypotheses about the impossibility of reducing in one way or another the complexity of the descriptions of the objects in question. Naturally this approach to the matter does not prevent the development of probability theory as a branch of mathematics being a special case of general measure theory.
The concepts of information theory as applied to infinite sequences give rise to very interesting investigations, which, without being indispensable as a basis of probability theory, can acquire a certain value in investigation of the algorithmic side of mathematics as a whole.
– Andrey Kolmogorov (1983)
While the above was a more personal story there are many other famous examples of cryptic one liners changing a view. Here’s a famous one:
A Famous Cryptic Comment:
I remember reading a story about the great mathematician and electrical engineer Robert Fano. Around the same time the father of Cybernetics, Norbert Wiener was at MIT and was famous at the time for wandering around campus and then talking to anybody about anything that caught his fancy. There are stories on how graduate students would run away when Wiener was sighted coming to save their time. Wiener’s eccentricities are famous (recommendation  below) but let me not digress. In one of his routine days he appeared in the office of Fano and made a cryptic comment:
You know, information is entropy.
Fano spent a good time thinking about what this might mean and he has himself remarked that it was in part responsible for his developing, completely independently the first law of Shannon’s theory. Claude Shannon even cited Fano in his famous paper.
I can’t help thinking that such one liners are perhaps the best examples of information compression and Kolmogorov complexity.
1. An Introduction to Kolmogorov Complexity and its Applications – Ming Li and Paul Vitanyi (on the basis of the first third)
2. Dark Hero of the Information Age – Conway and Siegelman
Earlier Related Post:
1. Ray Solomonoff is No More (has a short discussion on Solomonoff’s ideas in the same. It is noteworthy that Solomonoff published the first paper in what is today called Kolmogorov Complexity. His approach to the area was through Induction. Kolmogorov and Chaitin approached it from randomness).