Artificial Intelligence | Onionesque Reality

Posts Tagged ‘Artificial Intelligence’

Ray Solomonoff is No More

Posted in Artificial Intelligence, Computer Science, Machine Learning, Scientists, Thinkers, Innovators and Artists, tagged Algorithmic Information Theory, Artificial Intelligence, Dartmouth Conference, Induction, Inductive Inference, Inference, Kolmogorov Complexity, Machine Learning, Ray Solomonoff, Theoretical Computer Science, Universal Distribution on December 20, 2009| 6 Comments »

For the past couple of years, I have had a couple of questions about machine learning research that I have wanted to ask some experts, but I never got the chance to do so. I did not even know if my questions made sense at all. I will probably write about them in a blog post soon enough.

It is ironic, however, that I came to know my questions were valid and well discussed only through the death of Ray Solomonoff. (I never knew what to search for, since I used expressions not used by researchers. He was one researcher who worked on these questions, and an obituary highlighted that work, something I had missed.) Solomonoff was one of the founding fathers of Artificial Intelligence as a field and of Machine Learning as a discipline within it. He was one of the few attendees at the 1956 Dartmouth Conference, basically an extended brainstorming session that formally started AI as a field. The other attendees were: Marvin Minsky, John McCarthy, Allen Newell, Herbert Simon, Arthur Samuel, Oliver Selfridge, Claude Shannon, Nathaniel Rochester and Trenchard More. His 1950-52 papers on networks are regarded as the first statistical analysis of the same. Solomonoff was thus a towering figure in AI and Machine Learning.

[Ray Solomonoff : (25 July 1926- 7 December 2009)]

Solomonoff is widely considered the father of Machine Learning for circulating the first report on the subject in 1956. His particular focus was on the use of probability and its relation to learning. He founded the idea of Algorithmic Probability (ALP) in a 1960 paper at Caltech, an idea that gives rise to Kolmogorov Complexity as a side product. A. N. Kolmogorov independently discovered similar results, later came to know of Solomonoff’s earlier work on Algorithmic Information Theory, and acknowledged it. Solomonoff’s work, however, was less well known in the West than in the Soviet Union, which is why the complexity measure at the heart of Algorithmic Information Theory is mostly referred to as Kolmogorov Complexity rather than “Solomonoff Complexity”. Kolmogorov and Solomonoff approached the same framework from different directions. While Kolmogorov was concerned with randomness and Information Theory, Solomonoff was concerned with inductive reasoning, and in doing so he discovered ALP and Kolmogorov Complexity years before anyone else did. Below I will write about only one aspect of his work, which I have studied to some degree in the past year.

Solomonoff With G.J Chaitin, another pioneer in Algorithmic Information Theory


The Universal Distribution:

His 1956 paper, “An inductive inference machine” was one of the seminal papers that used probability in Machine Learning. He outlined two main problems which he thought (correctly) were linked.

The Problem of Learning in Humans : How do you use all the information that you gather in life in making decisions?

The Problem of Probability : Given you have some data and some a-priori information, how can you make the best possible predictions for the future?

The problem of learning is more general and related to the problem of probability. Solomonoff noted that Machine Learning was simply the process of approximating ideal probabilistic predictions for practical use.

Building on his 1956 paper, he developed probabilistic languages for induction at a time when such an approach was considered out of fashion, and discovered the Universal Distribution.

All induction problems can basically be reduced to this form: given a sequence of binary symbols, how do you extrapolate it? The answer is that we can assign a probability to each sequence and then use Bayes’ Theorem to predict how likely each particular continuation of a string is. That gives rise to an even more difficult question, which was the basic question behind a lot of Solomonoff’s work on Algorithmic Probability/Algorithmic Information Theory: how do you assign probabilities to strings?
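As a small worked form of this step (the symbols $P$ and $b$ are illustrative notation introduced here, not taken from Solomonoff’s papers): if $P(x)$ is the probability assigned to sequences that begin with $x$ , then the prediction for the next bit $b$ is simply a ratio of two such probabilities,

$\displaystyle P(b \mid x) = \frac{P(xb)}{P(x)}, \qquad b \in \{0, 1\}$

where $xb$ is $x$ with the bit $b$ appended. So the whole prediction problem reduces to choosing $P$ .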

Solomonoff approached this problem using the idea of a Universal Turing Machine. Suppose this Turing Machine has three types of tapes: a unidirectional input tape, a unidirectional output tape and a bidirectional working tape. Suppose this machine takes some binary string as input and may give a binary string as output.

It could do any of the following :

1. It could print out a string after a while and then come to a stop.

2. It could print an infinite output string.

3. It could go into an infinite loop while computing the string and not output anything at all (Halting Problem).

For a string $x$ , the ALP would be as follows :

If we feed some bits at random to our Turing Machine, there will always be some probability that the output would start with a string $x$ . This probability is the algorithmic or universal probability of the string $x$ .

The ALP would be given as :

$\displaystyle P_M(x) = \sum_{i=0}^{\infty}2^{-\lvert\ S_i(x)\rvert}$

Where $P_M(x)$ is the universal probability of the string $x$ with respect to the Universal Turing Machine $M$ . To understand the placement of $S_i(x)$ in the above expression, let’s discuss it a little.

There could be many random input strings that, after being processed by the Turing Machine, give an output string that begins with $x$ . $S_i(x)$ is the $i^{th}$ such string. Each such string carries a description of $x$ , and since we want to consider all cases, we take the summation. In the expression above, $\lvert S_i(x)\rvert$ is the length of the string $S_i(x)$ and $2^{-\lvert S_i(x)\rvert}$ is the probability that the random input begins with $S_i(x)$ and hence produces an output starting with $x$ .
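To make the definition concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `toy_machine` is a made-up two-mode machine, it is not universal, and the real $P_M$ cannot be computed this way. The sketch feeds every possible 12-bit input to the machine (each equally likely when the bits are random) and counts the fraction whose output starts with $x$ .

```python
from itertools import product


def toy_machine(tape, max_out=32):
    """A tiny, NON-universal stand-in for the machine M (hypothetical).

    First bit 0: copy the rest of the input tape to the output.
    First bit 1: read a two-bit pattern and print it over and over.
    """
    if not tape:
        return ""
    if tape[0] == "0":
        return tape[1:]
    pattern = tape[1:3]
    if len(pattern) < 2:
        return ""
    return (pattern * max_out)[:max_out]


def prob_output_starts_with(x, n_bits=12):
    """Fraction of all n_bits-long random inputs whose output begins with x.

    n_bits should be at least len(x) + 1 so that the copy mode can
    reproduce x in full.
    """
    hits = 0
    for bits in product("01", repeat=n_bits):
        if toy_machine("".join(bits)).startswith(x):
            hits += 1
    return hits / 2 ** n_bits


if __name__ == "__main__":
    print(prob_output_starts_with("01010101"))  # regular string
    print(prob_output_starts_with("01101001"))  # no two-bit period
```

On this toy machine the regular string `01010101` has two shortest inputs, `101` (3 bits, "repeat 01") and `0` followed by the 8 bits themselves (9 bits), so the count comes out as $2^{-3} + 2^{-9}$ , exactly the two terms of the sum. The irregular string can only be produced by copying, so it gets just $2^{-9}$ .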

This definition of ALP has the following properties that have been stated and proved by Solomonoff in the 60s and the 70s.

1. It assigns higher probabilities to strings with shorter descriptions. This is the reverse of something like Huffman Coding.

2. The value of ALP is largely independent of the particular universal machine used: switching machines changes it by at most a constant factor.

3. ALP is incomputable. This is the case because of the halting problem. In fact, this is the reason it has not received much attention: why get interested in a model that is incomputable? However, Solomonoff insisted that approximations to the ALP would be much better than existing systems and that getting the exact ALP is not even needed.

4. $P_M(x)$ is a complete description for $x$ . That means any pattern in the data could be found by using $P_M$ . This means that the universal distribution is the only inductive principle that is complete, and approximations to it would be highly desirable.
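Points 1 and 3 together suggest a cheap experiment. The sketch below uses the length of a zlib-compressed encoding as a crude, computable stand-in for the length of the shortest description. That substitution is an assumption for illustration only: a compressor gives just an upper bound on description length and is nowhere near the universal distribution. The helper name `description_length_bits` is hypothetical.

```python
import random
import zlib


def description_length_bits(s):
    """Bits in the zlib-compressed form of s: a rough upper bound on
    how long a description of s needs to be (illustrative proxy only)."""
    return 8 * len(zlib.compress(s.encode("ascii"), 9))


# A highly regular 1000-symbol string.
regular = "01" * 500

# A pseudo-random 1000-symbol string (fixed seed for repeatability).
rng = random.Random(0)
irregular = "".join(rng.choice("01") for _ in range(1000))

if __name__ == "__main__":
    print("regular  :", description_length_bits(regular), "bits")
    print("irregular:", description_length_bits(irregular), "bits")
```

The regular string gets a much shorter compressed description than the pseudo-random one of the same length, so a weighting of the form $2^{-\text{length}}$ would give it a far higher probability, which is the behaviour described in point 1.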

__________

Solomonoff also worked on Grammar discovery and was very interested in Koza’s Genetic Programming system, which he believed could lead to efficient and much better machine learning methods. He published papers till the ripe old age of 83 and is definitely inspiring for the love of his work. Paul Vitanyi notes that :

It is unusual to find a productive major scientist that is not regularly employed at all. But from all the elder people (not only scientists) I know, Ray Solomonoff was the happiest, the most inquisitive, and the most satisfied. He continued publishing papers right up to his death at 83.

Solomonoff’s ideas are still not exploited to their full potential and in my opinion would be necessary to explore to build the Machine Learning dream of never-ending learners and incremental + synergistic Machine Learning. I would write about this in a later post pretty soon. It was a life of great distinction and a life well lived. I also wish strength and peace to his wife Grace and his nephew Alex.

The five founders of AI still surviving in 2006, who met that year to commemorate 50 years of the Dartmouth Conference. From left: Trenchard More, John McCarthy, Marvin Minsky, Oliver Selfridge and Ray Solomonoff

__________

References and Links:

1. Ray Solomonoff’s publications.

2. Obituary: Ray Solomonoff – The founding father of Algorithmic Information Theory by Paul Vitanyi

3. The Universal Distribution and Machine Learning (PDF).

4. Universal Artificial Intelligence by Marcus Hutter (videolectures.net)

5. Minimum Description Length by Peter Grünwald (videolectures.net)

6. Universal Learning Algorithms and Optimal Search (Neural Information Processing Systems 2002 workshop)

__________

Onionesque Reality Home >>

Read Full Post »

Some Interesting Courses: Neural Nets, Visualization, ML and Astrophysical Chemistry

Posted in Artificial Intelligence, Chemistry, Computer Science, Game Theory, Machine Learning, Neuroscience, Physics, Space, Video Lectures, tagged Artificial Intelligence, Artificial Neural Networks, Astrophysical Chemistry, Bayesian Networks, Chemistry, Computational Neuroscience, Dr Harry Kroto, Game Theory, H. Sebastian Seung, Harvard University, Machine Learning, MIT, Multi-Agent Systems, Neural Networks, NLP, Reinforcement Learning, University Frieburg, Video Lectures, Visualization on July 14, 2009| Leave a Comment »

Here are a number of interesting courses, two of which I have been looking at for the past two weeks and which I hope to finish by the end of August or September.

Introduction to Neural Networks (MIT):

These days, amongst the other things that I have at hand, including a project on content-based image retrieval, I have been making it a point to look at an MIT course on Neural Networks. Needless to say, I am learning loads.

I would like to emphasize that though I have implemented a signature verification system using Neural Nets, I am by no means good with them. I can be classified as a beginner. The tool that I am more comfortable with is Support Vector Machines.

I have been wanting to know more about them for some years now, but I never really got the time or you can say the opportunity. Now that I can invest some time, I am glad I came across this course. So far I have been able to look at 7 lectures and I should say that I am MORE than very happy with the course. I think it is very detailed and extremely well suited for the beginner as well as the expert.

The instructor is H. Sebastian Seung, a professor of computational neuroscience at MIT.

The course has 25 lectures, each one packed with a great amount of information. This means the lectures might be slow going for those who are not very familiar with this material.

The video lectures can be accessed over here. I must admit that I am a little disappointed that these lectures are not available on YouTube, because the downloads are rather large. But I found them worth it anyway.

The lectures cover the following:

Lecture 1: Classical neurodynamics
Lecture 2: Linear threshold neuron
Lecture 3: Multilayer perceptrons
Lecture 4: Convolutional networks and vision
Lecture 5: Amplification and attenuation
Lecture 6: Lateral inhibition in the retina
Lecture 7: Linear recurrent networks
Lecture 8: Nonlinear global inhibition
Lecture 9: Permitted and forbidden sets
Lecture 10: Lateral excitation and inhibition
Lecture 11: Objectives and optimization
Lecture 12: Excitatory-inhibitory networks
Lecture 13: Associative memory I
Lecture 14: Associative memory II
Lecture 15: Vector quantization and competitive learning
Lecture 16: Principal component analysis
Lecture 17: Models of neural development
Lecture 18: Independent component analysis
Lecture 19: Nonnegative matrix factorization. Delta rule.
Lecture 20: Backpropagation I
Lecture 21: Backpropagation II
Lecture 22: Contrastive Hebbian learning
Lecture 23: Reinforcement Learning I
Lecture 24: Reinforcement Learning II
Lecture 25: Review session

The good thing is that I have formally studied most of the material after lecture 13, but going by the quality of the lectures so far (the first 7), I would not mind seeing them again.

Quick Links:

Course Home Page.

Course Video Lectures.

Prof H. Sebastian Seung’s Homepage.

_____

Visualization:

This is a Harvard course. I don’t know when I’ll get the time to have a look at it, but it sure looks extremely interesting, and I am sure a number of people would be interested in having a look at it. It actually looks like a course that can be covered pretty quickly.


The course description says the following:

The amount and complexity of information produced in science, engineering, business, and everyday human activity is increasing at staggering rates. The goal of this course is to expose you to visual representation methods and techniques that increase the understanding of complex data. Good visualizations not only present a visual interpretation of data, but do so by improving comprehension, communication, and decision making.

In this course you will learn how the human visual system processes and perceives images, good design practices for visualization, tools for visualization of data from a variety of fields, collecting data from web sites with Python, and programming of interactive visualization applications using Processing.

The topics covered are:

Data and Image Models
Visual Perception & Cognitive Principles
Color Encoding
Design Principles of Effective Visualizations
Interaction
Graphs & Charts
Trees and Networks
Maps & Google Earth
Higher-dimensional Data
Unstructured Text and Document Collections
Images and Video
Scientific Visualization
Medical Visualization
Social Visualization
Visualization & The Arts

Quick Links:

Course Home Page.

Course Syllabus.

Lectures, Slides and other materials.

Video Lectures

_____

Advanced AI Techniques:

This is one course that I will be looking at parts of after I have covered the course on Neural Nets. I have yet to glance at the first lecture or the materials, so I cannot say what they are like. But I am expecting a lot from them, going by the topics they cover.

The topics covered in a broad sense are:

Bayesian Networks
Statistical NLP
Reinforcement Learning
Bayes Filtering
Distributed AI and Multi-Agent systems
An Introduction to Game Theory

Quick Link:

Course Home.

_____

Astrophysical Chemistry:

I don’t know if I would be able to squeeze in time for these. But because of my amateurish interest in chemistry (If I were not an electrical engineer, I would have been into Chemistry), and because I have very high regard for Dr Harry Kroto (who is delivering them) I would try and make it a point to have a look at them. I think I’ll skip gym for some days to have a look at them. ;-)

[Nobel Laureate Harry Kroto with a Bucky-Ball model – Image Source : richarddawkins.net]

Quick Links:

Dr Harold Kroto’s Homepage.

Astrophysical Chemistry Lectures

_____


The Best Machine Learning Course on the Web

Posted in Machine Learning, Video Lectures, tagged Andrew Ng, Artificial Intelligence, Machine Learning, Stanford University, Tutorials, Video Lectures on September 11, 2008| 5 Comments »

Edit (November 05, 2011): Note that this post was made over THREE years ago. At that time this was the only comprehensive Machine Learning course available online. Since then the situation has changed. Professor Andrew Ng’s course has been offered online for everyone. Many other courses have also become available. Find some of these at the bottom of this post.

Just two weeks ago I posted a few lectures on Machine Learning, Learning Theory, Kernel Methods etc. on this post. Since then my friend and guide Sumedh Kulkarni informed me of a new course on Machine Learning on the Stanford University YouTube channel. I have also indexed this channel on my post on Video Lectures.

Since then I have already seen half of it, and though it covers a very broad range, and is meant to be a first course on Machine Learning, it is in my opinion the best course on the web on the same. Most others I find boring because of either poor English of the instructor or bad recording or both.

The course is taught by Dr Andrew Ng, who has very good experience in teaching this course and in working in Robotics, AI and Machine Learning in general. Incidentally, he has been the guide of PhD candidate Ashutosh Saxena, whose research papers we used for a previous project on pattern recognition.

Dr Ng’s deep knowledge of the field can be felt within minutes of the first lecture, which he makes even more interesting with his good communication skills and his ability to make lectures more exciting and intuitive by adding fun videos in between.

The course details are as follows.

Course: Machine Learning (CS 229). It can be accessed over here: Stanford Machine Learning.

Instructor: Dr Andrew Ng.

Course Overview:

Lecture 1: Overview of the broad field of Machine Learning.

Lecture 2: Linear regression, Gradient descent, and normal equations and discussion on how they relate to machine learning.

Lecture 3: Locally weighted regression, Probabilistic interpretation and Logistic regression.

Lecture 4: Newton’s method, exponential families, and generalized linear models.

Lecture 5: Generative learning algorithms and Gaussian discriminant analysis.

Lecture 6: Applications of naive Bayes, neural networks, and support vector machine.

Lecture 7: Optimal margin classifiers, KKT conditions, and SVM duals.

Lecture 8: Support vector machines, including soft margin optimization and kernels.

Lecture 9: Learning theory, covering bias, variance, empirical risk minimization, union bound and Hoeffding’s inequalities.

Lecture 10: VC dimension and model selection.

Lecture 11: Bayesian statistics, regularization, a digression on online learning, and the applications of machine learning algorithms.

Lecture 12: Unsupervised learning in the context of clustering, Jensen’s inequality, mixture of Gaussians, and expectation-maximization.

Lecture 13: Expectation-maximization in the context of the mixture of Gaussian and naive Bayes models, as well as factor analysis and digression.

Lecture 14: Factor analysis and expectation-maximization steps, and continues on to discuss principal component analysis (PCA).

Lecture 15: Principal component analysis (PCA) and independent component analysis (ICA) in relation to unsupervised machine learning.

Lecture 16: Reinforcement learning, focusing particularly on MDPs, value functions, and policy and value iteration.

Lecture 17: Reinforcement learning, focusing particularly on continuous state MDPs, discretization, and policy and value iterations.

Lecture 18: State action rewards, linear dynamical systems in the context of linear quadratic regulation, models, and the Riccati equation, and finite horizon MDPs.

Lecture 19: Debugging process, linear quadratic regulation, Kalman filters, and linear quadratic Gaussian in the context of reinforcement learning.

Lecture 20: POMDPs, policy search, and Pegasus in the context of reinforcement learning.

Course Notes: CS 229 Machine Learning.

My gratitude to Stanford and Prof Andrew Ng for providing this wonderful course to the general public.

—

Other Machine Learning Video Courses:

1. Tom Mitchell’s Machine Learning Course.

Related Posts:

1. Demystifying Support Vector Machines for Beginners. (Papers, Tutorials on Learning Theory, Machine Learning)

2. Video Lectures.


Demystifying Support Vector Machines for Beginners: A Reading List

Posted in Artificial Intelligence, Machine Learning, Video Lectures, tagged Artificial Intelligence, Computer Science, Data Mining, Generalization Theory, Kernel Based Learning, Kernel Methods, Learning Theory, Machine Learning, Optimization, Pattern Recognition, Support Vector Machines, Tutorials, Video Lectures on August 31, 2008| 30 Comments »

Though I have worked on pattern recognition in the past, I have always wanted to work with Neural Networks for the same. However, for some reason or the other I could never do so; I could not even take it as an elective subject due to some constraints. Over the last two years or so I have been promising myself and ordering myself to stick to a schedule and study ANNs properly. However, due to a combination of procrastination, over-work and bad planning, I have never been able to do anything with them.

However, I have now got the opportunity to work with Support Vector Machines, and for some time now I have been reading extensively on them and trying to start playing with them. Now that the actual implementation and work is set to start, I am pretty excited to work with them. It is nice that I get to work with SVMs though I could not with ANNs.

The Support Vector Machine is a classifier derived from statistical learning theory by Vladimir Vapnik and his co-workers. The foundations for it were laid by him as early as the 1970s. SVMs shot to prominence when, using pixel maps as input, they gave an accuracy comparable with sophisticated Neural Networks with elaborate features in a handwriting recognition task.

Traditionally, Neural Network based approaches have suffered some serious drawbacks, especially with generalization, producing models that can overfit the data. SVMs embody the structural risk minimization principle, which has been shown to be superior to the empirical risk minimization that neural networks use. This difference gives SVMs a greater ability to generalize.
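One common way to make this trade-off concrete is the soft-margin objective for a linear SVM: a capacity penalty on the weights plus the training (hinge) loss. The sketch below is only an illustration under stated assumptions. The function names, the toy data, and the values of `lam` and `eta` are all made up, and the plain stochastic subgradient descent used here is a stand-in for the quadratic programming solvers that SVM packages normally use.

```python
import random


def train_linear_svm(points, labels, lam=0.01, eta=0.1, epochs=100, seed=0):
    """Minimise (lam / 2) * ||w||^2 + average hinge loss with plain
    stochastic subgradient descent (an illustrative choice of solver)."""
    rng = random.Random(seed)
    w = [0.0] * len(points[0])
    b = 0.0
    order = list(range(len(points)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = points[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Capacity term: always shrink the weights a little.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Training-error term: only points inside the margin push back.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, linearly separable data (hypothetical).
points = [(2, 2), (3, 3), (2, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(points, labels)

if __name__ == "__main__":
    print("w =", w, "b =", b)
    print([predict(w, b, x) for x in points])
```

Setting `lam` to zero would leave only the training-error term, which is the empirical risk minimization end of the spectrum; a larger `lam` trades some training accuracy for a smaller weight vector and hence a wider margin.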

However, learning how to work with SVMs can be challenging and somewhat intimidating at first. When I started reading on the topic I took up the books by Vapnik on the subject but could not make head or tail of them. I could only attain a certain degree of understanding, nothing more. To specialize in something, I do well when I start off as a generalist, with a good and reasonably correct idea of what exactly is going on, knowing in general what is to be done and what is what. Once this initial know-how makes me comfortable, I reach the stage of starting with the mathematics, which gives profound understanding, as anything without mathematics is meaningless. However, most books that I came across missed the first point for me, and it was very difficult to make a head start. There was one book, which I could read in two days, that helped me get that general picture quite well. I would highly recommend it to anyone who is starting with SVMs. The book is titled Support Vector Machines and other Kernel Based Learning methods and is authored by Nello Cristianini and John Shawe-Taylor.

I would highly recommend people who are starting with Support Vector Machines to buy this book. It can be obtained easily over Amazon.

This book has very little mathematical treatment, but it makes the ideas involved clear, and this lets a person studying from it think more clearly before he/she refines his/her understanding by reading something mathematically heavier. Another that I would highly recommend is the book Support Vector Machines for Pattern Classification by Shigeo Abe.

Another book that I highly recommend is Learning with Kernels by Bernhard Scholkopf and Alexander Smola. Perfect book for beginners.

Only after one has covered the required material from these would I suggest Vapnik’s books, which would then work wonderfully well.

Other than the books there are a number of Video Lectures and tutorials on the Internet that can work as well!

Below is a listing of a large number of good tutorials on the topic. I don’t intend to flood a person interested in starting with too much information, so wherever possible I have described what the document carries so that one can decide what should suffice for him/her on the basis of need. I have also star-marked some of the entries. These are the ones that I have seen and studied from personally and found most helpful, and I am sure they would work the same way for beginners and people with reasonable experience alike.

—

Webcasts/ Video Lectures on Learning Theory, Support Vector Machines and related ideas:

EDIT: For those interested, I had posted about a course on Machine Learning that has been provided by Stanford University. It too is suited for an introduction to Support Vector Machines. Please find the post here. Also this comment might be helpful; suggestions on it according to your learning journey are also welcome.

1. *Machine Learning Workshop, University of California at Berkeley. This series covers most of the basics required. Beginners can skip the sessions on Bayesian models and Manifold Learning.

Workshop Outline:

Session 1: Classification.

Session 2: Regression.

Session 3: Feature Selection

Session 4: Diagnostics

Session 5: Clustering

Session 6: Graphical Models

Session 7: Linear Dimensionality Reduction

Session 8: Manifold Learning and Visualization

Session 9: Structured Classification

Session 10: Reinforcement Learning

Session 11: Non-Parametric Bayesian Models

2. Washington University. Beginners might be interested in the sole talk on the topic of Supervised Learning for Computer Vision Applications, or maybe in the talk on Dimensionality Reduction.

3. Reinforcement Learning, Universitat Freiburg.

4. Deep Learning Workshop. Good talks, but I’d say these are meant only for the highly interested.

5. *Introduction to Learning Theory, Olivier Bousquet.

This tutorial focuses on the “larger picture” rather than on mathematical proofs; it is not restricted to statistical learning theory, however. The course comprises five lectures and is quite good to watch. The Frenchman is both smart and fun!

6. *Statistical Learning Theory, Olivier Bousquet. This course gives a detailed introduction to Learning Theory with a focus on the Classification problem.

Course Outline:

Probabilistic and Concentration inequalities, Union Bounds, Chaining, Measuring the size of a function class, Vapnik Chervonenkis Dimension, Shattering Dimensions and Rademacher averages, Classification with real valued functions.

7. *Statistical Learning Theory, Olivier Bousquet. This is not a repeat of the above course; it is actually a more recent lecture series. This course has six lectures. Another excellent set.

Course Outline:

Learning Theory: Foundations and Goals

Learning Bounds: Ingredients and Results

Implications: What to conclude from bounds

7. Advanced Statistical Learning Theory, Olivier Bousquet. This set of lectures complements the above courses on statistical learning theory and gives a more detailed exposition of the current advancements in the same. This course has three lectures.

Course Outline:

PAC Bayesian bounds: a simple derivation, comparison with Rademacher averages, Local Rademacher complexity with classification loss, Talagrand’s inequality. Tsybakov noise conditions, Properties of loss functions for classification (influence on approximation and estimation, relationship with noise conditions), Applications to SVM – Estimation and approximation properties, role of eigenvalues of the Gram matrix.

8. *Statistical Learning Theory, John Shawe-Taylor, University of London. One plus point of this course is that it has some good English. Don’t miss this lecture, as it is given by the same professor whose book we just discussed.

9. *Learning with Kernels, Bernhard Scholkopf.

This course covers the basics for Support Vector Machines and related Kernel methods. This course has six lectures.

Course Outline:

Kernel and Feature Spaces, Large Margin Classification, Basic Ideas of Learning Theory, Support Vector Machines, Other Kernel Algorithms.

10. Kernel Methods, Alexander Smola, Australian National University. This is an advanced course as compared to the above and covers exponential families, density estimation, and conditional estimators such as Gaussian Process classification, regression, and conditional random fields, Moment matching techniques in Hilbert space that can be used to design two-sample tests and independence tests in statistics.

11. *Introduction to Kernel Methods, Bernhard Scholkopf, There are four parts to this course.

Course Outline:

Kernels and Feature Space, Large Margin Classification, Basic Ideas of Learning Theory, Support Vector Machines, Examples of Other Kernel Algorithms.

12. Introduction to Kernel Methods, Partha Niyogi.

13. Introduction to Kernel Methods, Mikhail Belkin, Ohio State University. This lecture is the second part of the above.

14. *Kernel Methods in Statistical Learning, John Shawe-Taylor.

15. *Support Vector Machines, Chih-Jen Lin, National Taiwan University. Easily one of the best talks on SVM. Almost like a run-down tutorial.

Course Outline:

Basic concepts for Support Vector Machines, training and optimization procedures of SVM, Classification and SVM regression.

16. *Kernel Methods and Support Vector Machines, Alexander Smola. A comprehensive six lecture course.

Course Outline:

Introduction of the main ideas of statistical learning theory, Support Vector Machines, Kernel Feature Spaces, An overview of the applications of Kernel Methods.

—

Additional Courses:

1. Basics of Probability and Statistics for Machine Learning, Mikaela Keller.

This course covers most of the basics that would be required for the above courses. However sometimes the shooting quality is a little shady. This talk seems to be the most popular on the video lectures site, one major reason in my opinion is that the lady delivering the lecture is quite pretty!

2. Some Mathematical Tools for Machine Learning, Chris Burges.

3. Machine Learning Laboratory, S.V.N Vishwanathan.

4. Machine Learning Laboratory, Chrisfried Webers.

—

Introductory Tutorials (PDF/PS):

1. *Support Vector Machines with Applications (Statistical Science). Click here >>

2. *Support Vector Machines (Marti Hearst, UC Berkeley). Click Here >>

3. *Support Vector Machines- Hype or Hallelujah (K. P. Bennett, RPI). Click Here >>

4. Support Vector Machines and Pattern Recognition (Georgia Tech). Click Here >>

5. An Introduction to Support Vector Machines in Data Mining (Georgia Tech). Click Here >>

6. University of Wisconsin at Madison CS 769 (Zhu). Click Here >>

7. Generalized Support Vector Machines (Mangasarian, University of Wisconsin at Madison). Click Here >>

8. *A Practical Guide to Support Vector Classification (Hsu, Chang, Lin, Via U-Michigan Ann Arbor). Click Here >>

9. *A Tutorial on Support Vector Machines for Pattern Recognition (Christopher J.C Burges, Bell Labs Lucent Technologies, Data mining and knowledge Discovery). Click Here >>

10. Support Vector Clustering (Hur, Horn, Siegelmann, Journal of Machine Learning Research. Via MIT). Click Here >>

11. *What is a Support Vector Machine (Noble, MIT). Click Here >>

12. Notes on PCA, Regularization, Sparsity and Support Vector Machines (Poggio, Girosi, MIT Dept of Brain and Cognitive Sciences). Click Here >>

13. *CS 229 Lecture Notes on Support Vector Machines (Andrew Ng, Stanford University). Click Here >>

—

Introductory Slides (mostly lecture slides):

1. Support Vector Machines in Machine Learning (Arizona State University). Click here >>

Lecture Outline:

What is Machine Learning, Solving the Quadratic Programs, Three very different approaches, Comparison on medium and large sets.

2. Support Vector Machines (Arizona State University). Click Here >>

Lecture Outline:

The Learning Problem, What do we know about test data, The capacity of a classifier, Shattering, The Hyperplane Classifier, The Kernel Trick, Quadratic Programming.

3. Support Vector Machines, Linear Case (Jieping Ye, Arizona State University). Click Here >>

Lecture Outline:

Linear Classifiers, Maximum Margin Classifier, SVM for Separable data, SVM for non-separable data.

4. Support Vector Machines, Non Linear Case (Jieping Ye, Arizona State University). Click Here >>

Lecture Outline:

Non Linear SVM using basis functions, Non-Linear SVMs using Kernels, SVMs for Multi-class Classification, SVM path, SVM for unbalanced data.

5. Support Vector Machines (Sue Ann Hong, Carnegie Mellon). Click Here >>

6. Support Vector Machines (Carnegie Mellon University Machine Learning 10701/15781). Click Here >>

7. Support Vector Machines and Kernel Methods (CMU). Click Here >>

8. SVM Tutorial (Columbia University). Click Here >>

9. Support Vector Machines (Via U-Maryland at College Park). Click Here >>

10. Support Vector Machines: Algorithms and Applications (MIT OCW). Click Here >>

11. Support Vector Machines (MIT OCW). Click Here >>

—

Papers/Notes on some basic related ideas (no esoteric research papers here):

1. Robust Feature Induction for Support Vector Machines (Arizona State University). Click Here >>

2. Hidden Markov Support Vector Machines (Brown University). Click Here >>

3. *Training Data Set for Support Vector Machines (Brown University). Click Here >>

4. Support Vector Machines are Universally Consistent (Journal Of Complexity). Click Here >>

5. Feature Selection for Classification of Variable Length Multi-Attribute Motions (Li, Khan, Prabhakaran). Click Here >>

6. Selecting Data for Fast Support Vector Machine Training (Wang, Neskovic, Cooper). Click Here >>

7. *Normalization in Support Vector Machines (Caltech). Click Here >>

8. The Support Vector Decomposition Machine (Pereira, Gordon, Carnegie Mellon). Click Here >>

9. Semi-Supervised Support Vector Machines (Bennett, Demiriz, RPI). Click Here >>

10. Supervised Clustering with Support Vector Machines (Finley, Joachims, Cornell University). Click Here >>

11. Metric Learning: A Support Vector Approach (Cornell University). Click Here >>

12. Training Linear SVMs in Linear Time (Joachims, Cornell University). Click Here >>

13. *Rule Extraction from Linear Support Vector Machines (Fung, Sandilya, Rao, Siemens Medical Solutions). Click Here >>

14. Support Vector Machines, Reproducing Kernel Hilbert Spaces and Randomized GACV (Wahba, University of Wisconsin at Madison). Click Here >>

15. The Mathematics of Learning: Dealing with Data (Poggio, Girosi, AI Lab, MIT). Click Here >>

16. Training Invariant Support Vector Machines (Decoste, Scholkopf, Machine Learning). Click Here >>

—

*As I have already mentioned above, the star-marked courses/lectures/tutorials/papers are the ones that I have personally seen and studied from (and hence can vouch for); in my opinion these should work best for beginners.


Kafka to Red Ant: A Strange Metamorphosis

Posted in Artificial Intelligence, Artificial Life, Image Processing, Science, Signal Processing, Swarm Intelligence, Thinkers, Innovators and Artists, tagged Artificial Intelligence, Artificial Life, Franz Kafka, Image Processing, Image Segmentation, Pattern Recognition, Science, Swarm Intelligence, Vitorino Ramos, W on June 26, 2008| 4 Comments »

Before I make a start I want to make it very clear that, in spite of what the title may suggest, this is not a “sensational” post. It is just something that really intrigued me. It basically falls under the domain of image segmentation and pattern recognition; however, it is something that can equally intrigue a person with a non-scientific background who likes (or dislikes) Franz Kafka’s work. I keep the title because it is the title of an original work by Dr Vitorino Ramos, and changing it would not be right.

Note: People who are not interested in technical details can skip those parts and read only the text in bold.

Franz Kafka is one writer whose works have had a profound impact on me, in that they disturbed me each time I thought about them. This is not ONLY because of his writings per se, but for a greater part because I had read a lot about his rather tragic life, and I saw in his works a heartbreaking reflection of what happened in it (I see a lot of similarities between Kafka’s life and that of Premchand, although most of Premchand’s work was published in his lifetime, even if true critical acclaim came after his death). Yes, I do think that his writings give a good picture of Europe at that time and of human needs and behavior, but the former reason outweighs all these. Kafka remains one of my favorite writers, though his works are basically short stories. He mostly wrote on a theme that emphasized the alienation of man and the indifference of society. Kafka’s tormenting thoughts on dehumanization, the cruel world and bureaucratic labyrinths, which he experienced as part of the not-so-liked Jewish minority in Prague, along with his experiences in the jobs he held, his love life and affairs, and a constant fear of mental and physical collapse as a result of clinical depression and ill health, are reflected in a lot of his works, including his novella The Metamorphosis.

W. H. Auden rightly wrote about Kafka:

“Kafka is important to us because his predicament is the predicament of the modern man”

In The Metamorphosis the protagonist Gregor Samsa wakes up one morning to find that he has turned into a giant insect. It is fairly apparent that Kafka meant the “transformation” in a metaphorical sense and not a literal one, drawing mostly on his fears and his own life experiences. The novella starts like this. . .

As Gregor Samsa awoke one morning from uneasy dreams he found himself transformed in his bed into a monstrous vermin.

—

While rummaging through a few scientific papers that explored the problem of pattern recognition using a distributed approach, I came across a few by Dr Ramos et al. which dealt with the issue using the artificial ant colonies approach.

In the previous post I had mentioned that the self organization of neurons into a brain like structure and the self organization of an ant colony are similar in more than a few ways. If this can be implemented, it could have implications for pattern recognition problems, where the perceptive abilities emerge from the local and simple interactions of simple agents. Such decentralized systems, a part of the swarm intelligence paradigm, look very promising for pattern recognition and for the specific case of image segmentation, as these may basically be considered clustering and combinatorial problems, taking the image itself as an ant colony habitat.

The basis for this post was laid down in the previous post on colony cognitive maps. We observed the evolution of a pheromonal field there and a simple model for the same:

[Evolution of a distribution of (artificial) ants over time: Image Source]

The above is the evolution of the distribution of artificial ants on a square lattice; this work has been extended to digital image lattices by Ramos et al. Image segmentation is an image processing problem wherein the image under consideration is partitioned into different regions, for example into areas of low contrast and areas of high contrast, on the basis of texture, grey level and so on. Image segmentation is very important, as the output of an image segmentation process may be used as an input in object recognition scenarios. The work of Ramos et al. (in the references below) and some of the papers cited in his works have really intrigued me, and I would strongly suggest that readers have a look at them if they are at all interested in image segmentation, pattern recognition and self organization in general. Some might also be interested in implementing something similar!

—

In one of the papers a swarm of artificial ants was released onto a digital habitat (an image of Albert Einstein) to explore it for 1000 iterations. The Einstein image is replaced by a map image. The evolution of the colony cognitive maps for the Einstein image habitat is shown below for various iterations.

[Evolution of a pheromonal field on an Einstein image habitat for t= 0, 1, 100, 110, 120, 130, 150, 200, 300, 400, 500, 800, 900, 1000: Image Source]

The above is represented most aptly in a .gif image.

[Evolution of a pheromonal field on an Einstein habitat: Image Source]

Now instead of Einstein a Kafka image was taken and subjected to the same process. [Image Source]

The Kafka image habitat is replaced by a red ant in the second row. The abstract of the paper by the same name reads:

Created with an Artificial Ant Colony, that uses images as Habitats, being sensible to their gray levels. At the second row, Kafka is replaced as a substrate, by Red Ant. In black, the higher levels of pheromone (a chemical evaporative sugar substance used by swarms on their orientation trough out the trails). It’s exactly this artificial evaporation and the computational ant collective group synergy reallocating their upgrades of pheromone at interesting places, that allows for the emergence of adaptation and “perception” of new images. Only some of the 6000 iterations processed are represented. The system does not have any type of hierarchy, and ants communicate only in indirect forms, through out the successive alteration that they found on the Habitat.

Now what intrigues me is that the transition is extremely rapid. Such perceptive ability under a change in the image habitat could have massive implications for how we look at pattern recognition in such cases.

Extremely intriguing!

—

Resources on Franz Kafka:

1. A Brief Biography

2. The Metamorphosis At Project Gutenberg. Click here >>

3. The Kafka Project

References and STRONGLY recommended papers:

1. Artificial Ant Colonies in Digital Image Habitats – A Mass behavior Effect Study on Pattern Recognition. Vitorino Ramos and Filipe Almeida. Click Here >>

2. Social Cognitive Maps, Swarms Collective Perception and Distributed Search on Dynamic Landscapes. Vitorino Ramos, Carlos Fernandes, Agostinho C. Rosa. Click Here >>

3. Self-Regulated Artificial Ant Colonies on Digital Image Habitats. Carlos Fernandes, Vitorino Ramos, Agostinho C. Rosa. Click Here >>

4. On the Implicit and the Artificial – Morphogenesis and Emergent Aesthetics in Autonomous Collective Systems. Vitorino Ramos. Click Here >>

5. A Strange Metamorphosis [Kafka to Red Ant], Vitorino Ramos.


Colony Cognitive Maps

Posted in Artificial Intelligence, Artificial Life, Cognitive Science, Complexity, Neuroscience, Swarm Intelligence, tagged Artificial Intelligence, Brain, Cognitive Maps, Cognitive Science, Complex Adaptive Systems, Complexity, Markov Process, Neural Networks, Neuroscience, Self-Organization, Social Insects, Swarm Intelligence on June 24, 2008| 2 Comments »

Some posts back, I posted on Non-Human Art or Swarm Paintings; there I mentioned that those paintings were NOT random but were a Colony Cognitive Map.

This post will serve as the conceptual basis for the Swarm Paintings post, the next post and a few future posts on image segmentation.

Motivation: Some might wonder what the point of writing about such a topic is, and think that it is totally unrelated to what I generally write about. No! That is not the case. Most of the stuff I write about is related in some sense. The motivation for reading thoroughly about this (and writing) may be condensed into the following:

1. The idea of a colony cognitive map is used in SI/A-life experiments, areas that really interest me.

2. Understanding the idea of colony cognitive maps gives a much better understanding of the inherent self organization in insect swarms and gives a lead to understand self organization in general.

3. Colony cognitive maps have a parallel in cognitive maps, which come from cognitive science and brain science. These again are areas that really interest me, as they hold the key to the evolution and development of REAL artificial intelligence in the future.

—

The term “Colony Cognitive Map”, as I had pointed out earlier, is in a way a parallel to a Cognitive Map in brain science (I use the term brain science for a combination of fields like neuroscience, behavioral psychology, cognitive science and the like, and will use it in this sense in this post), and the name is inspired by the same!

There is more than just a romantic resemblance between the self-organization of “simple” neurons into an intelligent brain like structure, producing behaviors well beyond the capabilities of an individual neuron, and the self-organization of simple and un-intelligent insects into complex swarms that produce intelligent, very complex and also aesthetically pleasing behavior! I have written previously on such intelligent mass behavior. Consider another example: neurons communicate by releasing neurotransmitters, much as a social insect colony is marked by the laying and deposition of pheromones.

[Self Organization in Neurons (Left) and a bird swarm(Below). Photo Credit >> Here and Here]

First let us revisit what swarm intelligence roughly is (yes, I am still to write a post on a mathematical definition of the same!). Swarm intelligence is basically a property of a system in which the collective actions of unsophisticated agents, acting locally, cause functional and sophisticated global patterns to emerge. Swarm intelligence gives a scheme for exploring decentralized problem solving. An example that is also one of my favorites is that of a bird swarm, wherein the collective behavior of birds, each of which is very simple, causes very complex global patterns to emerge. I have written about this previously; don’t forget to look at the beautiful video there if you have not done so already!

Self Organization in the Brain: Over the last two months or so I have been reading Douglas Hofstadter’s magnum opus, Gödel, Escher, Bach: an Eternal Golden Braid (GEB). As early as 1979, this great book made reference to self organization in the brain and compared it with the behavior of ant colonies and the self organization in them.

[Photo Source: Wikipedia Commons]

A brain is often regarded as one of the most complex entities, if not the most complex. However, if we look at a rock, it is very complex too, so what makes a brain so special? What distinguishes the brain from something like a rock is the purposeful arrangement of all the elements in it. The massive parallelism and self organization observed in it, amongst other things, also make it special. Research in cybernetics in the 1950s and 1960s led the “cyberneticians” to try to explain the complex reactions and actions of the brain, produced without any external instruction, in terms of self organization. Out of this line of investigation grew the idea of neural networks (from 1943 onward), which are basically very simplified models of how neurons interact in our brains. Unlike the conventional approaches in AI, there is no centralized control over a neural network. All the neurons are connected to each other in some way or the other, but just as in an ant colony, none is in control. Together, however, they make very complex behaviors possible. Each neuron works on a simple principle, and combinations of many neurons can lead to complex behavior, which is believed to be an example of self-organization. In order to help the animal survive in its environment, the brain should be in tune with it too. One way the brain does this is by constantly learning and making predictions on that basis, which means a constant change and evolution of connections.

Cognitive Maps: The concept of space and how humans perceive it has been a topic that has undergone a lot of discussion in academia and philosophy. A cognitive map is often called a mental map, a mind map, cognitive model etc.

The origin of the term Cognitive Map is largely attributed to Edward Chace Tolman. Here cognition refers to the mental models that people use to perceive, understand and react to seemingly complex information. To understand what a mental model means, it is helpful to consider an example I came across on Wikipedia. A mental model is an inherent explanation in somebody’s thought process of how something works in the spatial or external world in general. It is hypothesized that once a mental model for something, or for some representation, is formed in the brain, it can replace careful analysis and careful decision making to reduce the cognitive load. Coming back to the example, consider a person whose mental model perceives a snake as dangerous. A person who holds this model will likely retreat rapidly, as if by reflex, without initial conscious logical analysis, whereas somebody who does not hold such a model might not react in the same way.

Extending this idea, we can look at cognitive maps as a method to structure, organize and store spatial information in the brain, which can reduce the cognitive load using mental models and enhance quick learning and recall of information.

In a new locality for example, human way-finding involves recognition and appreciation of common representations of information such as maps, signs and images so to say. The human brain tries to integrate and connect this information into a representation which is consistent with the environment and is a sort of a “map”. Such spatial (not necessarily spatial) internal representations formed in the brain can be called a cognitive map. As the familiarity of a person with an area increases then the reliance on these external representations of information gradually reduces. And the common landmarks become a tool to localize within a cognitive map.

Cognitive maps store conscious perceptions of the sense of position and direction, and also the subconscious automatic interconnections formed as a result of acquiring spatial information while traveling through the environment. Thus cognitive maps help to determine the position of a person, the positioning of objects and places, and the idea of how to get from one place to another. A cognitive map may therefore also be said to be an internal cognitive collage.

Though metaphorically similar the idea of a cognitive map is not really similar to a cartographic map.

Colony Cognitive Maps: With the above general background it is much easier to think of a colony cognitive map, as it is basically an analogy to the above. As described in my post on adaptive routing, social insects such as ants construct trails and networks of regular traffic via a process of pheromone deposition, positive feedback and amplification by trail following. These are very similar to cognitive maps. However, one obvious difference lies in the fact that cognitive maps lie inside the brain, whereas social insects such as ants write their spatial memories in the external environment.

Let us try to picture this in terms of ants; I have written about how a colony cognitive map is formed in this post, without mentioning the term.

A rather indispensable aspect of mass communication in insect swarms is Stigmergy. Stigmergy refers to indirect communication through markers, such as pheromones in ants. Two distinct types of stigmergy are observed. One is called sematectonic stigmergy; it involves a change in the physical characteristics of the environment. An example of sematectonic stigmergy is nest building, wherein an ant observes a structure developing and adds its ball of mud to the top of it. The other form of stigmergy is sign-based and hence indirect. Here something is deposited in the environment that makes no direct contribution to the task being undertaken but is used to influence subsequent task-related behavior. Sign-based stigmergy is very highly developed in ants. Ants use chemicals called pheromones to build a very sophisticated signaling system. Ants foraging for food lay down some pheromone which marks the path that they follow. An isolated ant moves at random, but an ant encountering a previously laid trail will detect it and decide to follow it with a high probability, thereby reinforcing it with a further quantity of pheromone. Since the pheromone evaporates, the lesser used paths will gradually vanish. We see that this is a collective behavior.

Now we assume that in an environment the actors (say, ants) emit pheromone at a set rate, and that the pheromone evaporates at a constant rate. We also assume that the ants themselves have no memory of previous paths taken and act ONLY on the basis of local interactions with the pheromone concentrations in their vicinity. Now consider the “field” or “map” that forms in the environment as the overall result of the movements of the individual ants over a fixed period of time. This “pheromonal” field contains information about the past movements and decisions of the individual ants.

The pheromonal field (cognitive map), as I just mentioned, contains information about past movements and decisions of the organisms, but not arbitrarily far in the past, since the field “forgets” its distant history due to evaporation over time. Now this is an exact parallel to a cognitive map, with the difference that for a colony the spatial information is written in the environment, unlike the human cognitive map, which lies inside the brain. Another major similarity is that neurons release a number of neurotransmitters, which can be considered a parallel to the pheromones released as described above! The similarities are striking!

Now if I look back at the post on swarm paintings, we can see that we can make such paintings with the help of a swarm of robots. More pheromone concentration on a path means more paint. Hence the painting is NOT random but is EMERGENT. I hope I could make the idea clear.

How Swarms Build Colony Cognitive Maps: Now it would be worthwhile to look at a simple model of how ants construct cognitive maps, which I read about in a wonderful paper by Mark Millonas and Dante Chialvo. Though I have already mentioned them, I’ll still sum up the basic assumptions.

Assumptions:

1. The individual agent (or ant) is memoryless.

2. There is no direct communication between the organisms.

3. There is no spatial diffusion of the pheromone deposited. It remains fixed at a point where it was deposited.

4. Each agent emits pheromone at a constant rate say $\eta$ .

Stochastic Transition Probabilities:

Now the state of each agent can be described by a phase variable which contains its position $r$ and orientation $\theta$ . Since the response at any given time depends solely on the present and not on the previous history, it is sufficient to specify the transition probability from one position and orientation $(r,\theta)$ to another position and orientation $(r',\theta')$ an instant later. Thus the movement of each individual agent can roughly be considered a continuous Markov process whose probabilities at each instant of time are decided by the pheromone concentration $\sigma(x, t)$ .

Using theoretical considerations and generalizations from observations of ant colonies, the response function can be effectively summed up in a two-parameter pheromone weight function:

$\displaystyle W(\sigma) = \left(1 + \frac{\sigma}{1 + \delta\sigma}\right)^{\beta}$

This weight function measures the relative probability of moving to a site $r$ with pheromone density $\sigma(r)$ .

The first parameter, $\beta$ , measures how strongly an agent follows a pheromone trail, that is, the degree of randomness in its choice. For low values of $\beta$ the pheromone concentration has little impact on its choice, while for higher values it does.

The second parameter enters through the factor $\displaystyle\frac{1}{\delta}$ , which signifies the sensory capability. It describes the fact that an ant’s ability to sense pheromone decreases somewhat at higher concentrations, something like a saturation scenario.
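
A minimal Python sketch of this weight function, and of how it turns into movement probabilities, may help here. It assumes the form $W(\sigma) = (1 + \frac{\sigma}{1+\delta\sigma})^{\beta}$ used by Chialvo and Millonas, with $\beta$ controlling how strongly trails are followed and $1/\delta$ the sensory capacity, and it leaves out the directional bias that the full model also includes. The function names and the default parameter values are my own illustrative choices, not something taken from the paper.

```python
import numpy as np

def pheromone_weight(sigma, beta=3.5, delta=0.2):
    """Assumed weight W(sigma) = (1 + sigma / (1 + delta * sigma)) ** beta."""
    sigma = np.asarray(sigma, dtype=float)
    return (1.0 + sigma / (1.0 + delta * sigma)) ** beta

def move_probabilities(neighbor_sigma, beta=3.5, delta=0.2):
    """Normalize the weights of the neighboring sites into probabilities."""
    w = pheromone_weight(neighbor_sigma, beta, delta)
    return w / w.sum()

if __name__ == "__main__":
    # Three candidate sites: no pheromone, a little, and a lot.
    print(move_probabilities([0.0, 1.0, 10.0]))
```

With $\beta = 0$ every neighbor gets the same probability (a pure random walk), and as $\sigma$ grows the weight levels off at $(1 + 1/\delta)^{\beta}$ , which is the saturation effect described above.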

Pheromone Evolution: It is essential to describe how the pheromone evolves. According to an assumption already made, each agent emits pheromone at a constant rate $\eta$ with no spatial diffusion. If the pheromone at a location is not replenished then it will gradually evaporate. The pheromonal field so formed does contain a memory of the past movements of the agents in space, however because of the evaporation process it does not have a very distant memory.

Analysis: Another important parameter is the number of ants present, expressed as the density of ants $\rho_0$ . Using all these parameters we can define a single parameter, the average pheromonal field $\displaystyle\sigma_0 = \frac{\rho_0 \eta}{\kappa}$ , where $\displaystyle \kappa$ is the rate of scent decay (the evaporation rate mentioned above).
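
To see where such an expression comes from, here is a short back-of-the-envelope derivation. It assumes (my simplification, not a statement from the paper) that the ants are spread uniformly with density $\rho_0$ and that evaporation is proportional to the amount of pheromone present. Then the pheromone at any site obeys

$\displaystyle \frac{d\sigma}{dt} = \eta\rho_0 - \kappa\sigma$

Setting the left-hand side to zero gives the balance between deposition and evaporation, $\displaystyle \sigma_0 = \frac{\rho_0\eta}{\kappa}$ . Starting from $\sigma(0) = 0$ the solution is $\displaystyle \sigma(t) = \sigma_0\left(1 - e^{-\kappa t}\right)$ , so the field relaxes on a time scale of roughly $1/\kappa$ , which is also about how far back its memory reaches.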

Further detailed analysis can be found here. With the above background it is just a matter of working through it.

[Evolution of distribution of ants : Source]

Now after continuing with the mathematical analysis in the hyperlink above, we fix the values of the parameters.

Then a large number of ants are placed at random positions. The movement of each ant is determined by the transition probability $P_{ik}$ .

Another assumption is that the pheromone density at each point at $t=0$ is zero. Each ant deposits pheromone at a decided rate $\eta$ and also the pheromone evaporates at a fixed rate $\kappa$ .

In the above beautiful picture we see the evolution of a distribution of ants on a 32×32 lattice. A pattern begins to emerge as early as the 100th time step. Weak pheromonal paths evaporate completely and we finally get an emergent ant distribution pattern, as shown in the final image.

The conclusion that Chialvo and Millonas draw is that scent following of the very fundamental type described above (see the assumptions) is sufficient, all by itself, to produce the evolution (emergence) of complex patterns of organized flow of social insect traffic. The detailed conclusions can be read in this wonderful paper!
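
The experiment described above can be roughly sketched in Python. This is only an illustration under assumptions of my own and not the authors' code: the ants have no orientation (so there is no directional bias), the lattice wraps around at the edges, each ant picks one of its 8 neighboring sites using the assumed weight $W(\sigma) = (1 + \frac{\sigma}{1+\delta\sigma})^{\beta}$ , and evaporation removes a fraction $\kappa$ of the pheromone per step. The function name `simulate` and all default parameter values are hypothetical choices for the sketch.

```python
import numpy as np

def simulate(size=32, n_ants=300, steps=200, eta=0.07, kappa=0.015,
             beta=3.5, delta=0.2, seed=0):
    """Memoryless ants on a size x size lattice; returns (pheromone, positions)."""
    rng = np.random.default_rng(seed)
    sigma = np.zeros((size, size))                  # no pheromone at t = 0
    pos = rng.integers(0, size, size=(n_ants, 2))   # random starting sites
    moves = np.array([(-1, -1), (-1, 0), (-1, 1), (0, -1),
                      (0, 1), (1, -1), (1, 0), (1, 1)])
    for _ in range(steps):
        for a in range(n_ants):
            cand = (pos[a] + moves) % size          # 8 neighbors, periodic edges
            s = sigma[cand[:, 0], cand[:, 1]]       # pheromone the ant can sense
            w = (1.0 + s / (1.0 + delta * s)) ** beta
            pos[a] = cand[rng.choice(8, p=w / w.sum())]
        # Every ant deposits eta where it now stands (no spatial diffusion),
        # then a fraction kappa of all pheromone evaporates.
        np.add.at(sigma, (pos[:, 0], pos[:, 1]), eta)
        sigma *= 1.0 - kappa
    return sigma, pos

if __name__ == "__main__":
    field, ants = simulate()
    print("mean pheromone per site:", field.mean())
```

Plotting `field` (or the ant positions) at a few different values of `steps` is the natural way to look for the kind of trail patterns shown in the picture; whether they appear will depend on the parameter values chosen.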

References and Suggested for Further Reading:

1. Cognitive Maps, click here >>

2. Remembrance of places past: A History of Theories of Space. click here >>

3. The Science of Self Organization and Adaptivity, Francis Heylighen, Free University of Brussels, Belgium. Click here >>

4. The Hippocampus as a Cognitive Map, John O’Keefe and Lynn Nadel, Clarendon Press, Oxford. To access the pdf version of this book click here >>

5. The Self-Organization in the Brain, Christoph von der Malsburg, Depts for Computer Science, Biology and Physics, University of Southern California.

6. How Swarms Build Cognitive Maps, Dante R. Chialvo and Mark M. Millonas, The Santa Fe Institute of Complexity. Click here >>

7. Social Cognitive Maps, Swarm Collective Perception and Distributed Search on Dynamic Landscapes, Vitorino Ramos, Carlos Fernandes, Agostinho C. Rosa.

Related Posts:

1. Swarm Paintings: Non-Human Art

2. The Working of a Bird Swarm

3. Adaptive Routing taking Cues from Stigmergy in Ants

Possibly Related:

Gödel, Escher, Bach: A Mental Space Odyssey

Swarm Paintings: Non-Human Art

Posted in Art, Artificial Intelligence, Artificial Life, Biology, Complexity, Robotics, Swarm Intelligence, tagged A-Life, Ants, Art, Artificial Intelligence, Artificial Life, Biology, Complexity, Emergence, Non Human Art, Swarm Intelligence on May 13, 2008| 10 Comments »

General Background: Since childhood I have enjoyed sketching and painting, and very much at that! Sometimes I found myself copying an existing image or painting, making small changes here and there. Yes, the paintings came out beautiful (or so I think!), but one thing always made me unhappy: I thought that the creativity needed to make original stuff was missing at times (not always). It was not there all the time. It came in bursts and went away.

I agree with Leonel Moura (from his article) that creativity is basically produced due to different experiences and interactions. Absence or lack of which could make art lose novelty.

Talking of novelty, how about looking at art in nature? Richard Dawkins states that the difference between human art or design and the amazingly “ingenious” forms that we encounter in nature is due to the fact that human art originates in the mind, while natural designs result from natural selection, which is very true. However, it is another matter that natural selection and cultural selection, which ultimately decides the “popularity” of an art, don’t function in the same way. Anyhow, how can we remove the cultural bias or the human bias that we have in our art forms?

Answers in Artificial Life: Artificial life may be defined as “A field of study devoted to understanding life by attempting to derive general theories underlying biological phenomena, and recreating these dynamics in other physical media – such as computers – making them accessible to new kinds of experimental manipulation and testing. This scientific research links biology and computer science.”

Most of the A-Life simulations today cannot be considered truly alive, as they still cannot show some properties of truly alive systems and also have considerable human bias in their design. However, two views have existed on the whole idea of Artificial Life and the extent to which it can go.

Weak A-Life is the idea that the “living process” cannot be achieved outside a chemical domain. Weak A-Life researchers concentrate on simulating life processes with the underlying aim of understanding the biological processes.

Strong A-Life is exactly the reverse. John von Neumann once remarked “life is a process which can be abstracted away from any particular medium”. In recent times ecologist Tom Ray declared that his computer simulation Tierra was not a simulation of life but a synthesis of life. In Tierra, computer programs compete for CPU time and access to the main memory. These programs are also evolvable: they can replicate, mutate and recombine.

Relating A-Life to Art: While researching these ideas, and the fact that they could be used to generate the art forms that I talked about in the first paragraph, I came across a few papers by Swarm Intelligence guru Vitorino Ramos and a couple of articles by Leonel Moura, who had worked in collaboration with Dr Ramos on precisely this theme.

Swarm Paintings: Thus the idea, as I had mentioned in my very first paragraph, is to create an organism ideally with minimum pre-commitment to any representational art scheme or human style or taste. It sounds simple but is not so simple to implement!

There are a number of projects that have dealt with creating art, but these mostly have been evolutionary algorithms that learn from human behavior, and learn about human mannerisms and try to create art according to that. The idea here is to create art with a minimum of human intervention.

I came across a project by Dr Vitorino Ramos, which I had pointed to implicitly in the last paragraph, called the ARTSBOT (ARTistic Swarm roBOTs) project. This project tries to address the issue of minimizing human intervention in aesthetics, ethnicity, taste, style etc. In short, the idea was to remove or minimize the anthropocentric bias that pervades all our art forms. Obviously all this can have massive implications for our understanding of biological processes too; however, here we’ll talk only of art.

Two of the first paintings that emerged were:

(Source: Here)

These paintings were among the first swarm paintings by Leonel Moura and Vitorino Ramos. Now we see that these seem detached from a functional human pre-commitment. They don’t seem to represent any emotion or style or taste. However they still look very pleasant!

However the point to be understood and to be noted is that these are NOT random pictures created either by a programme or by a swarm of robots moving “randomly”. These pictures were generated by a horde of artificial ants and also by robots. They are not random, but they EMERGE from a process of pheromone deposition and evaporation as was simulated in this system from ants. Thus the result that we have above is a Colony Cognitive Map. The colony cognitive map is analogous to a cognitive map in the brain. I will cover the idea of a colony cognitive map in the next post.

A couple more beautiful paintings can be seen below!

(Source for both images : Here>>)

Though I have already mentioned how these art forms emerge, I would still like to quote a paragraph from here:

The painting robots are artificial ‘organisms’ able to create their own art forms. They are equipped with environmental awareness and a small brain that runs algorithms based on simple rules. The resulting paintings are not predetermined, emerging rather from the combined effects of randomness and stigmergy, that is, indirect communication through the environment.
Although the robots are autonomous they depend on a symbiotic relationship with human partners. Not only in terms of starting and ending the procedure, but also and more deeply in the fact that the final configuration of each painting is the result of a certain gestalt fired in the brain of the human viewer. Therefore what we can consider ‘art’ here, is the result of multiple agents, some human, some artificial, immerged in a chaotic process where no one is in control and whose output is impossible to determine.
Hence, a ‘new kind of art’ represents the introduction of the complexity paradigm in the cultural and artistic realm.’

A painting bot looks something like the one in the picture below:

A swarm of robots at work:

The final art generated by the swarm of these robots is beautiful!

(Photo Credit for the three pictures above: Here>>)

Conclusions:

The work of Dr Ramos and Leonel Moura can be summed up as:

1. The human is only the “art-architect”, the “swarm” is the artist.

2. The “life” of Artificial Life shows characteristics of natural life itself, namely morphogenesis, the ability to adapt to changing environments, evolution etc.

Leonel Moura’s wonderful article states that the final aim is to create an “Artificial Autopoietic System”, intriguing indeed and eagerly awaited!!
Such simulations could change the way we understand the biological processes and life.

I am also now wondering how music could be produced based on the same or similar ideas. I wonder if swarm music could be made available. It would be most interesting and I can’t wait to listen to it already!

Have a look at this video by Leonel Moura, which has some time-lapse footage of robots painting.

References:

1. Ant-Swarm Morphogenese, Leonel Moura

2. On the Implicit and on the Artificial – Morphogenesis and Emergent Aesthetics in Autonomous Collective Systems, in ARCHITOPIA Book, Art, Architecture and Science, INSTITUT D’ART CONTEMPORAIN, J.L. Maubant et al. (Eds.), pp. 25-57, Chapter 2, Vitorino Ramos.

3. A Strange Metamorphosis [From Kafka to Red Ant], Vitorino Ramos

Links:

The following links lead to more exciting papers and paintings.

1. Dr Vitorino Ramos

2. Leonel Moura


Zero Crossings as an Effective Feature In Speech Recognition for Embedded Applications

Posted in Artificial Intelligence, Embedded Systems, Signal Processing, tagged AI, Artificial Intelligence, dynamic time warping, Embedded Systems, feature extraction, Signal Processing, speech recognition on March 20, 2008| Leave a Comment »

Motivation: A couple of months back I was thinking about designing a speaker-dependent speech recognizer on the 8051 micro-controller, or any of its derivatives, for simple machine control. We would of course need an isolated word (or digit) recognizer.

Problem: Speech recognizers can be implemented using Hidden Markov Models or Artificial Neural Networks. There are plenty of such systems in place. However, the problem with these algorithms is that they are computationally quite intensive and thus cannot be implemented on a simple 8-bit fixed-point micro-processor, which is what we need for simple machine control applications. So there is a need for a simpler algorithm.

All these algorithms also employ a short-term feature vector to take care of the non-stationary nature of speech. Generally the frame length is chosen so that the signal within the frame is quasi-stationary. Feature vectors are an area of active research. Generally, however, at the university level, Mel Frequency Cepstrum Coefficients (MFCC) or Linear Predictive Coefficients are taken as features. These too require computations that are beyond the scope of a simple processor or micro-controller like the 8051.

Solution: I was thinking about what could be done to reduce this burden and choose a simpler feature so that it could be implemented on the 8051. While researching this I came across a paper[1]. This paper deals with exactly this problem!

The researchers have used only zero crossings of the speech signal to determine the feature vector. Since this novel feature extraction method is based on zero crossings only, it just needs a one bit A to D conversion[2]. This feature extraction is computationally very simple and does not require the speech signal to be pre-processed.

This feature vector is basically the histogram of the time interval between successive zero-crossings of the utterance in a short time window[1]. These feature vectors for each window are then combined together to form a feature matrix. Since we are dealing with only small time series (isolated words), we can employ Dynamic Time Warping to compare the input matrix with the stored reference matrices. I will discuss this in another post sometime.

To obtain this vector the following steps need to be followed.

1. The speech signal x(t) is band-pass filtered to give s(t).

2. s(t) is then subjected to infinite amplitude clipping with the help of a zero-crossing detector (ZCD) to give u(t).

3. u(t) is then sampled at, say, 8 kHz to give u[n]. The feature extraction is carried out on u[n].

4. u[n] is divided into a number of short-time windows of W samples each.

5. The histogram for each of these short-time windows is found. The histogram (or vector) is found as follows: the number of times ONLY ONE sample is recorded between successive zero crossings constitutes element number one of the vector. The number of times ONLY TWO samples are recorded between successive zero crossings constitutes element number two of the feature vector, and so on. In this way we construct a histogram which is an appropriate feature vector.
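The steps above can be sketched in Python as follows. This is only an illustration of the idea, not the implementation from the paper[1]: it assumes the band-pass filtering and sampling (steps 1 and 3) have already been done, and the function names, the non-overlapping windows, the dropping of a trailing partial window and the handling of long intervals (lumped into the last bin) are all my own assumptions.

```python
def clip_to_sign(samples):
    """Infinite amplitude clipping: keep only the sign (a one-bit signal)."""
    return [1 if s >= 0 else -1 for s in samples]

def interval_histogram(window, n_bins):
    """Histogram of the number of samples between successive zero crossings.

    Element k-1 counts how often exactly k samples separate two crossings.
    """
    hist = [0] * n_bins
    last_crossing = None
    for i in range(1, len(window)):
        if window[i] != window[i - 1]:        # sign change = zero crossing
            if last_crossing is not None:
                interval = i - last_crossing  # samples between the two crossings
                # Assumption: intervals longer than n_bins share the last bin.
                hist[min(interval, n_bins) - 1] += 1
            last_crossing = i
    return hist

def feature_matrix(samples, window_size, n_bins):
    """One histogram row per non-overlapping window of window_size samples."""
    u = clip_to_sign(samples)
    rows = []
    # Assumption: a trailing partial window is simply dropped.
    for start in range(0, len(u) - window_size + 1, window_size):
        rows.append(interval_histogram(u[start:start + window_size], n_bins))
    return rows
```

For example, `feature_matrix([1, -1, 1, -1, 1, 1, -1, -1, -1, 1], 10, 4)` gives `[[3, 1, 1, 0]]`: the crossings are one sample apart three times, two samples apart once and three samples apart once. Note that only comparisons, counters and additions are needed, which is what makes the feature attractive for an 8-bit controller.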

These vectors can then be combined for all windows to get the feature matrix. These, as I said earlier, can be compared using DTW/DDTW/Fast DTW or some other algorithm.
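As a rough idea of what such a comparison looks like, here is a minimal sketch of classic DTW between two feature matrices (lists of histogram rows). It is a textbook formulation rather than the method of the paper[1]; the city-block frame distance and the names `dtw_distance` and `recognise` are assumptions made for illustration.

```python
import math

def frame_distance(a, b):
    """City-block distance between two histogram vectors (an assumed choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def dtw_distance(test, reference):
    """Classic dynamic time warping cost between two sequences of vectors."""
    n, m = len(test), len(reference)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(test[i - 1], reference[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j],      # test frame repeated
                                 cost[i][j - 1],      # reference frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognise(test, templates):
    """Return the label of the stored template closest to the test matrix."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

Here `templates` would be a dictionary mapping each word (say, each Hindi digit) to the feature matrix recorded for it during training; the recognised word is simply the one whose template has the smallest warped distance to the input.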

As an example, take an utterance of the number three in Hindi, which is spoken as “teen”. The first plot below gives the waveform for the utterance. The second plot gives the end-point detected version of the same. End-point detection reduces computations (and hence the memory required) by removing the “useless” portions of the utterance, which do not contain any information.

The surface plot of the feature matrix prepared for my utterance above (where, as I have mentioned implicitly, the rows represent the windows and the columns represent the histogram terms) is as follows:

References:

[1] A Microprocessor based Speech Recognizer for Isolated Hindi Digits, Ashutosh Saxena and Abhishek Singh, IEEE ACE.

[2] Zero-Crossing-Based Linear Prediction for Speech Recognition, Lipovac, Electronics Letters, vol. 25, issue 2, pages 90-92, 19 Jan 1989.


The Motivation behind Artificial Neural Networks

Posted in Artificial Intelligence, Computing, tagged ANN, Artificial Intelligence, Artificial Neural Networks, Facts, Neural Networks, Parallel Computing on February 15, 2008| Leave a Comment »

The human brain is an incredibly impressive information processor, even though it “works” quite a bit slower than an ordinary computer. Many researchers in artificial intelligence look to the organization of the brain as a model for building intelligent machines. Think of a sort of “analogy” between the complex webs of interconnected neurons in a brain and the densely interconnected units making up an artificial neural network (ANN), where each unit, just like a biological neuron, is capable of taking in a number of inputs and producing an output. Consider this description:

“To develop a feel for this analogy, let us consider a few facts from neurobiology. The human brain is estimated to contain a densely interconnected network of approximately 10¹¹ neurons, each connected, on average, to 10⁴ others. Neuron activity is typically excited or inhibited through connections to other neurons. The fastest neuron switching times are known to be on the order of 10⁻³ seconds, quite slow compared to computer switching speeds of 10⁻¹⁰ seconds. Yet humans are able to make surprisingly complex decisions, surprisingly quickly. For example, it requires approximately 10⁻¹ seconds to visually recognize your mother. Notice the sequence of neuron firings that can take place during this 10⁻¹-second interval cannot possibly be longer than a few hundred steps, given the switching speed of single neurons. This observation has led many to speculate that the information-processing abilities of biological neural systems must follow from highly parallel processes operating on representations that are distributed over many neurons. One motivation for ANN systems is to capture this kind of highly parallel computation based on distributed representations.”

[via Machine Learning (Section 4.1.1; page 82) by Tom M. Mitchell, McGraw Hill Companies, Inc. (1997).]
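The bound on sequential steps in the passage is just a ratio of the two time scales it quotes. As a rough back-of-the-envelope check, using only those order-of-magnitude figures:

```latex
\[
  \frac{10^{-1}\ \mathrm{s}\ \text{(recognition time)}}
       {10^{-3}\ \mathrm{s}\ \text{(fastest neuron switching time)}}
  = 10^{2}\ \text{sequential firings},
  \qquad
  \frac{10^{-1}\ \mathrm{s}}{10^{-10}\ \mathrm{s}\ \text{(computer switching time)}}
  = 10^{9}\ \text{sequential steps}.
\]
```

So a chain of neurons gets on the order of a hundred steps where a serial computer gets about a billion, which is why the recognition must be coming from many neurons working in parallel rather than from a long serial computation.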


Robots Inspired by Animals

Posted in Artificial Intelligence, Nature Inspired Computing, Robotics, tagged Artificial Intelligence, Nature Inspired Robots, Robotics, Robots on February 4, 2008| Leave a Comment »

There have been quite a few posts on nature-inspired computing. Here is another: a really cool video on the same theme. Have a look at the gait of the yellow salamander mimic bot and also the caterpillar bot at the end of the video!

Have fun!

The introduction to this video reads as:

Robotics researchers are increasingly turning to nature for inspiration. Watch a robotic salamander, a water strider robot, mechanical cockroaches and some cool self-configuring robots.

Footage courtesy of: University of Essex, Ecole Polytechnique Federale de Lausanne, Carnegie Mellon University, ULB-EPFL, Tokyo Institute of Technology, National Institute of Advanced Science and Technology (AIST).

My congratulations and best wishes to the researchers who are trying to develop such bots and also to New Scientist for such a good video!
