## Maximum Number of Regions in Arrangement of Hyperplanes

A semi-useful fact for people working in Machine Learning.

Blurb: Recently, my supervisor was kind enough to institute a small reading group on Deep Neural Networks to help us understand them better. A happy side product is that I get to explore an old interest of mine: I have long wanted to study Neural Nets in detail, especially the work from the mid 90s, when the area matured and connections to many other areas such as statistical physics became apparent. But before I got around to that, they came back in vogue; I dub this resurgence Connectionism 2.0. (Although the hipster in me dislikes hype, it always lends a good excuse to explore an interest.) I also attended a Summer School on the topic at UCLA; find slides and talks over here.

Coming back: we have been making our way through this Foundations and Trends volume by Yoshua Bengio (which I highly recommend). Bengio spends a lot of time giving intuitions and arguments for why and when Deep Architectures are useful. One particular argument (which is actually quite general and not specific to trees, as it might appear) went like this:

Ensembles of trees (like boosted trees, and forests) are more powerful than a single tree. They add a third level to the architecture which allows the model to discriminate among a number of regions exponential in the number of parameters. As illustrated in […], they implicitly form a distributed representation […]

This is followed by an illustration with the following figure:

[Image Source: Learning Deep Architectures for AI]

and accompanying text to the above figure:

Whereas a single decision tree (here just a two-way partition) can discriminate among a number of regions linear in the number of parameters (leaves), **an ensemble of trees can discriminate among a number of regions exponential in the number of trees, i.e., exponential in the total number of parameters** (at least as long as the number of trees does not exceed the number of inputs, which is not quite the case here). Each distinguishable region is associated with one of the leaves of each tree (here there are three 2-way trees, each defining two regions, for a total of seven regions). This is equivalent to a multi-clustering, here three clusterings each associated with two regions.

Now, the following question was considered: referring to the boldface text above, is the number of regions obtained exponential in the general case? It is easy to see that there are cases where it is not. For example, if all three trees overlap exactly, the regions obtained by the three trees are the same as those obtained by one tree, giving no benefit. The claim above refers to a paper (also by Yoshua Bengio) that constructs examples showing the number of regions can be exponential in some cases.
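One way to get a feel for the claim is to count, empirically, how many distinct cells an ensemble of 2-way partitions actually carves out. The sketch below is my own (using axis-aligned decision stumps as stand-ins for 2-way trees): $m$ stumps label each point with an $m$-bit pattern, so the number of distinguishable regions can approach $2^m$ but may be smaller.

```python
# Each "tree" here is a 1-level decision stump: a threshold on one coordinate.
# m stumps assign each point an m-bit pattern, i.e. a "region" label.
def count_regions(points, stumps):
    """Count the distinct bit-patterns that the stumps induce on the points."""
    patterns = {tuple(x[dim] > t for dim, t in stumps) for x in points}
    return len(patterns)

# 2-D points on a fine grid; three 2-way cuts, cf. the three trees in the figure.
grid = [(i / 50.0, j / 50.0) for i in range(51) for j in range(51)]
stumps = [(0, 0.3), (0, 0.7), (1, 0.5)]  # (coordinate index, threshold)

print(count_regions(grid, stumps))  # 6: fewer than the 2**3 = 8 upper bound
```

Here three stumps yield only 6 regions rather than $2^3 = 8$, since two cuts on the same coordinate cannot realize all four of their sign combinations; this is the same geometric obstruction the question above is about.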

But suppose, for our satisfaction, we are interested in actual numbers and in the following general question, an answer to which also answers the question raised above:

What is the maximum possible number of regions that can be obtained in ${\bf R}^n$ by the intersection of $m$ hyperplanes?

Let’s consider some examples in the ${\bf R}^2$ case.

When there is one hyperplane, i.e. $m = 1$, the maximum number of possible regions is obviously two.

[1 Hyperplane: 2 Regions]

Clearly, for two hyperplanes, the maximum number of possible regions is 4.

[2 Hyperplanes: 4 Regions]

With three hyperplanes, the maximum number of regions possible is 7, as illustrated in the first figure. For $m = 4$, i.e. 4 hyperplanes, this number is 11, as shown below:

[4 Hyperplanes: 11 Regions]

On inspection we can see that the number of regions formed by $m$ hyperplanes (lines) in ${\bf R}^2$, provided no three of them pass through a common point, is given by:

Number of Regions = #lines + #intersections + 1
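This rule can be checked mechanically; a small sketch of my own, which counts the distinct pairwise intersection points with exact rational arithmetic and applies the rule (it assumes no three lines are concurrent, the case in which the rule holds):

```python
from fractions import Fraction
from itertools import combinations

def regions(lines):
    """Regions cut out of the plane by lines given as (a, b, c): a*x + b*y = c.

    Uses the rule: 1 + #lines + #intersection points. Valid as long as no
    three lines pass through a common point (parallel lines are fine).
    """
    pts = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if det != 0:  # parallel pairs contribute no intersection point
            x = Fraction(c1 * b2 - c2 * b1, det)
            y = Fraction(a1 * c2 - a2 * c1, det)
            pts.add((x, y))
    return 1 + len(lines) + len(pts)

# Four lines in general position: x = 0, y = 0, x + y = 3, x - y = 5.
four = [(1, 0, 0), (0, 1, 0), (1, 1, 3), (1, -1, 5)]
print(regions(four))  # 11, matching the figure
```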

Now let’s consider the general case: what expression gives the maximum number of regions possible with $m$ hyperplanes in ${\bf R}^n$? It turns out there is a definite expression for this:

Let $\mathcal{A}$ be an arrangement of $m$ hyperplanes in ${\bf R}^n$; then the maximum number of regions possible is given by

$\displaystyle r(\mathcal{A}) = 1 + m + \dbinom{m}{2} + \ldots + \dbinom{m}{n}$

and the number of bounded regions (closed from all sides) is given by:

$\displaystyle b(\mathcal{A}) = \dbinom{m - 1}{n}$

The above expressions are in fact special cases of a more general result (Zaslavsky's theorem), which applies when $\mathcal{A}$ is specifically a real arrangement.
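In code, both counts are one-liners (the function names here are my own):

```python
from math import comb

def max_regions(m, n):
    """r(A) = C(m,0) + C(m,1) + ... + C(m,n) for m hyperplanes in R^n."""
    return sum(comb(m, i) for i in range(n + 1))

def max_bounded_regions(m, n):
    """b(A) = C(m-1, n)."""
    return comb(m - 1, n)

# The R^2 examples above, m = 1..4 lines:
print([max_regions(m, 2) for m in range(1, 5)])  # [2, 4, 7, 11]
print(max_bounded_regions(4, 2))                 # 3 bounded regions for 4 lines
```

Note that for $m$ lines in general position the intersection count is $\binom{m}{2}$, so this recovers the #lines + #intersections + 1 rule from the ${\bf R}^2$ case.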

I am not familiar with some of the techniques required to prove this in the general case (where $\mathcal{A}$ is a set of affine hyperplanes in a vector space defined over some field). The expressions above can be proven by induction. It turns out, however, that in the ${\bf R}^2$ case the result is related to the crossing lemma (see the discussion over here); it was one of the favourite results of one of my previous advisors, and he often talked about it. This connection makes me want to study the details [1], [2], which I haven't had the time to look at; I will post the proof of the general version in a future post.

________________

References:

1. An Introduction to Hyperplane Arrangements: Richard P. Stanley (PDF)

Related:

2. Partially Ordered Sets: William Trotter (Handbook of Combinatorics) (PDF)

________________

Onionesque Reality Home >>

## IPAM-UCLA Summer School on Deep Learning and Feature Learning

Deep Learning reads Wikipedia and discovers the meaning of life – Geoff Hinton.

The above quote is from a very interesting talk by Geoffrey Hinton I had the chance to attend recently.

I have been at a summer school on Deep Neural Nets and Unsupervised Feature Learning at the Institute for Pure and Applied Mathematics at UCLA since July 9 (till July 27). It was organized by Geoff Hinton, Yoshua Bengio, Stan Osher, Andrew Ng and Yann LeCun.

I have always been a “fan” of Neural Nets, and the recent spike of interest in them has me excited, so the school happened at just the right time. The objective of the summer school is to give a broad overview of some of the recent work in Deep Learning and Unsupervised Feature Learning, with emphasis on optimization, deep architectures and sparse representations. I must add that after getting here and looking at the peer group, I consider myself lucky to have obtained funding for the event!

[Click on the above image to see slides for the talks. Videos are now available at the same location.]

That aside, if you are interested in Deep Learning or Neural Networks in general, the slides for the talks are being uploaded over here (or click on the image above); videos will be added at the same location some time after the summer school ends, so you might like to bookmark the link.

The school has been interesting given the wide range of people who are here. The diversity of opinions about Deep Learning itself has given a good perspective on the subject, its issues and its strengths. There are quite a few here who are somewhat skeptical of deep learning but curious, while some have been actively working on it for a while. It has also been enlightening to see completely divergent views between some of the speakers on key ideas such as sparsity. For example, Geoff Hinton had a completely different view of why sparsity is useful in classification tasks than Stéphane Mallat, who gave a very interesting talk today, even joking that “Hinton and Yann LeCun told you why sparsity is useful; I’ll tell you why sparsity is useless.” See the above link for more details.

Indeed, such opinions do tell you that there is a lot of fecund ground for research in these areas.

I have been compiling a reading list on some of this stuff and will make a blog-post on the same soon.

________________


## Some Interesting Courses: Neural Nets, Visualization, ML and Astrophysical Chemistry

Here are a number of interesting courses, two of which I have been looking at for the past two weeks and hope to finish by the end of August or September.

## Introduction to Neural Networks (MIT):

These days, amongst the other things I have at hand (including a project on content-based image retrieval), I have been making it a point to work through an MIT course on Neural Networks. And needless to say, I am learning loads.

I would like to emphasize that though I have implemented a signature verification system using Neural Nets, I am by no means good with them; I can be classified as a beginner. The tool I am more comfortable with is Support Vector Machines.

I have been wanting to know more about them for some years now, but I never really got the time, or you could say the opportunity. Now that I can invest some time, I am glad I came across this course. So far I have watched 7 lectures, and I should say I am more than happy with the course. It is very detailed and extremely well suited to the beginner as well as the expert.

The instructor is H. Sebastian Seung, professor of computational neuroscience at MIT.

The course has 25 lectures, each packed with a great amount of information, which means the lectures might go slowly for those not very familiar with this material.

The video lectures can be accessed over here. I must admit I am a little disappointed that these lectures are not available on YouTube, because the downloads are rather large. But I found them worth it anyway.

The lectures cover the following:

Lecture 1: Classical neurodynamics
Lecture 2: Linear threshold neuron
Lecture 3: Multilayer perceptrons
Lecture 4: Convolutional networks and vision
Lecture 5: Amplification and attenuation
Lecture 6: Lateral inhibition in the retina
Lecture 7: Linear recurrent networks
Lecture 8: Nonlinear global inhibition
Lecture 9: Permitted and forbidden sets
Lecture 10: Lateral excitation and inhibition
Lecture 11: Objectives and optimization
Lecture 12: Excitatory-inhibitory networks
Lecture 13: Associative memory I
Lecture 14: Associative memory II
Lecture 15: Vector quantization and competitive learning
Lecture 16: Principal component analysis
Lecture 17: Models of neural development
Lecture 18: Independent component analysis
Lecture 19: Nonnegative matrix factorization. Delta rule.
Lecture 20: Backpropagation I
Lecture 21: Backpropagation II
Lecture 22: Contrastive Hebbian learning
Lecture 23: Reinforcement Learning I
Lecture 24: Reinforcement Learning II
Lecture 25: Review session

The good thing is that I have formally studied most of the material after lecture 13, but going by the quality of the lectures so far (the first 7), I would not mind seeing it again.

Course Video Lectures.

Prof H. Sebastian Seung’s Homepage.

_____

## Visualization:

This is a Harvard course. I don’t know when I’ll get the time to have a look at it, but it sure looks extremely interesting, and I am sure a number of people would be interested in it. It also looks like a course that can be covered pretty quickly.

The course description says the following:

The amount and complexity of information produced in science, engineering, business, and everyday human activity is increasing at staggering rates. The goal of this course is to expose you to visual representation methods and techniques that increase the understanding of complex data. Good visualizations not only present a visual interpretation of data, but do so by improving comprehension, communication, and decision making.

In this course you will learn how the human visual system processes and perceives images, good design practices for visualization, tools for visualization of data from a variety of fields, collecting data from web sites with Python, and programming of interactive visualization applications using Processing.

The topics covered are:

• Data and Image Models
• Visual Perception & Cognitive Principles
• Color Encoding
• Design Principles of Effective Visualizations
• Interaction
• Graphs & Charts
• Trees and Networks
• Higher-dimensional Data
• Unstructured Text and Document Collections
• Images and Video
• Scientific Visualization
• Medical Visualization
• Social Visualization
• Visualization & The Arts

Course Syllabus.

Lectures, Slides and other materials.

Video Lectures

_____

## Machine Learning:

This is one course I will be looking at parts of after I have covered the course on Neural Nets. I am yet to glance at the first lecture or the materials, so I cannot say what they will be like, but I am expecting a lot from them going by the topics they cover.

The topics covered in a broad sense are:

• Bayesian Networks
• Statistical NLP
• Reinforcement Learning
• Bayes Filtering
• Distributed AI and Multi-Agent systems
• An Introduction to Game Theory

Course Home.

_____

## Astrophysical Chemistry:

I don’t know if I will be able to squeeze in time for these. But because of my amateurish interest in chemistry (if I were not an electrical engineer, I would have gone into Chemistry), and because I have very high regard for Dr Harry Kroto (who is delivering them), I will try to make it a point to have a look. I think I’ll skip the gym for some days to watch them. ;-)

[Nobel Laureate Harry Kroto with a Bucky-Ball model – Image Source : richarddawkins.net]

Dr Harold Kroto’s Homepage.

Astrophysical Chemistry Lectures

_____


## Colony Cognitive Maps

Some posts back, I posted on Non-Human Art or Swarm Paintings, where I mentioned that those paintings were NOT random but were a Colony Cognitive Map.

This post will serve as the conceptual basis for the Swarm Paintings post, the next post and a few future posts on image segmentation.

Motivation: Some might wonder what the point of writing about such a topic is, and whether it is totally unrelated to what I generally write about. That is not the case; most of the stuff I write about is related in some sense. The motivation for reading thoroughly about this (and writing about it) may be condensed into the following:

1. The idea of a colony cognitive map is used in SI/A-life experiments, areas that really interest me.

2. Understanding the idea of colony cognitive maps gives a much better understanding of the inherent self organization in insect swarms and gives a lead to understand self organization in general.

3. The parallel to colony cognitive maps, cognitive maps, comes from cognitive science and brain science; again, areas that really interest me, as they hold the key for REAL artificial intelligence evolution and development in the future.

The term “Colony Cognitive Map”, as I pointed out earlier, is in a way a parallel to a Cognitive Map in brain science (I use the term brain science for a combination of fields such as neuroscience, behavioral psychology, cognitive science and the like, and will use it in this sense in this post), and the name is inspired by the same!

There is more than just a romantic resemblance between the self-organization of “simple” neurons into an intelligent brain-like structure, producing behaviors well beyond the capabilities of an individual neuron, and the self-organization of simple, unintelligent insects into complex swarms producing intelligent, very complex and aesthetically pleasing behavior! I have written previously on such intelligent mass behavior. Consider another example: neurons transmit neurotransmitters much as a social insect colony is marked by pheromone deposition and trail laying.

[Self Organization in Neurons (left) and a bird swarm (below). Photo Credit >> Here and Here]

First let us revisit what swarm intelligence roughly is (yes, I have yet to write a post giving a mathematical definition of it). Swarm Intelligence is basically a property of a system where the collective actions of unsophisticated agents acting locally cause functional and sophisticated global patterns to emerge. Swarm intelligence gives a scheme to explore decentralized problem solving. An example, and one of my favorites, is a bird swarm, wherein the collective behaviors of birds, each of which is very simple, cause very complex global patterns to emerge. I have written about this previously; don’t forget to look at the beautiful video there if you have not done so already!

Self Organization in the Brain: Over the last two months or so I have been reading Douglas Hofstadter’s magnum opus, Gödel, Escher, Bach: an Eternal Golden Braid (GEB). This great book makes a reference to self-organization in the brain and its comparison with the behavior of ant colonies, and it did so as early as 1979.

[Photo Source: Wikipedia Commons]

A brain is often regarded as one of the most complex entities, if not the most complex. But if we look at a rock, it is very complex too; what, then, makes a brain so special? What distinguishes the brain from something like a rock is the purposeful arrangement of all the elements in it, along with the massive parallelism and self-organization observed in it, among other things. Research in Cybernetics in the 1950s and 1960s led the “cyberneticians” to try to explain the complex reactions and actions of the brain, absent any external instruction, in terms of self-organization. Out of these investigations grew the idea of neural networks (from 1943 onwards), which are basically very simplified models of how neurons interact in our brains. Unlike the conventional approaches in AI, there is no centralized control over a neural network: the neurons are all connected to each other in some way or another, but, just as in an ant colony, none is in control. Together, however, they make very complex behaviors possible. Each neuron works on a simple principle, and combinations of many neurons can lead to complex behavior, an example believed to be due to self-organization. To help the animal survive, the brain should be in tune with its environment; one way it achieves this is by constantly learning and making predictions on that basis, which means a constant change and evolution of connections.

Cognitive Maps: The concept of space and how humans perceive it has undergone a lot of discussion in academia and philosophy. A cognitive map is also often called a mental map, a mind map, a cognitive model, etc.

The origin of the term Cognitive Map is largely attributed to Edward Chace Tolman; here cognition refers to the mental models that people use to perceive, understand and react to seemingly complex information. To understand what a mental model means, consider an example I came across on Wikipedia. A mental model is an inherent explanation in somebody’s thought process of how something works in the spatial, or external, world in general. It is hypothesized that once a mental model of something is formed in the brain, it can replace careful analysis and deliberate decision making, reducing the cognitive load. Coming back to the example: consider a person’s mental model of a snake as dangerous. A person who holds this model will likely retreat rapidly, as if by reflex, without initial conscious logical analysis, while somebody who does not hold such a model might not react the same way.

Extending this idea, we can look at cognitive maps as a method to structure, organize and store spatial information in the brain, which can reduce the cognitive load using mental models and enhance quick learning and recall of information.

In a new locality, for example, human way-finding involves recognition and use of common representations of information such as maps, signs and images. The human brain tries to integrate and connect this information into a representation consistent with the environment: a sort of “map”. Such internal representations formed in the brain (not necessarily spatial) can be called a cognitive map. As a person’s familiarity with an area increases, reliance on these external representations of information gradually reduces, and common landmarks become a tool for localizing within the cognitive map.

Cognitive maps store conscious perceptions of the sense of position and direction, as well as the subconscious, automatic interconnections formed while acquiring spatial information traveling through the environment. Thus cognitive maps help determine the position of a person, the positions of objects and places, and the idea of how to get from one place to another. A cognitive map may also be said to be an internal cognitive collage.

Though metaphorically similar, a cognitive map is not really like a cartographic map.

Colony Cognitive Maps: With the above general background it is much easier to picture a colony cognitive map, as it is basically an analogy to the above. As described in my post on adaptive routing, social insects such as ants construct trails and networks of regular traffic via a process of pheromone deposition, positive feedback and amplification by trail following. These are very similar to cognitive maps, with one obvious difference: cognitive maps lie inside the brain, while social insects such as ants write their spatial memories in the external environment.

Let us try to picture this in terms of ants; I HAVE written about how a colony cognitive map is formed in this post, without mentioning the term.

A rather indispensable aspect of such mass communication in insect swarms is stigmergy. Stigmergy refers to indirect communication using markers such as pheromones in ants. Two distinct types of stigmergy are observed. One is called sematectonic stigmergy, which involves a change in the physical characteristics of the environment; an example is nest building, wherein an ant observes a structure developing and adds its ball of mud to the top of it. The other form of stigmergy is sign-based, and hence indirect: something is deposited in the environment that makes no direct contribution to the task being undertaken but is used to influence subsequent task-related behavior. Sign-based stigmergy is very highly developed in ants, which use chemicals called pheromones as a very sophisticated signaling system. Ants foraging for food lay down pheromone, marking the path they follow. An isolated ant moves at random, but an ant encountering a previously laid trail will detect it, decide to follow it with high probability, and reinforce it with a further quantity of pheromone. Since the pheromone evaporates, the lesser-used paths gradually vanish. We see that this is a collective behavior.

Now assume that in an environment the actors (say, ants) emit pheromone at a set rate, and that the pheromone evaporates at a constant rate. We also assume that the ants themselves have no memory of previous paths taken and act ONLY on the basis of local interactions with pheromone concentrations in the vicinity. Consider the “field” or “map” formed in the environment as the overall result of the movements of the individual ants over a fixed period of time: this “pheromonal” field contains information about the past movements and decisions of the individual ants.

The pheromonal field (cognitive map), as just mentioned, contains information about past movements and decisions of the organisms, but not arbitrarily far in the past, since the field “forgets” its distant history through evaporation over time. This is exactly a parallel to a cognitive map, with the difference that for a colony the spatial information is written in the environment, not inside the brain as in a human cognitive map. Another major similarity is that neurons release a number of neurotransmitters, which can be considered a parallel to the pheromones described above. The similarities are striking!

Now, looking back at the post on swarm paintings, we can see how such paintings can be made with the help of a swarm of robots: more pheromone concentration on a path means more paint. Hence the painting is NOT random but EMERGENT. I hope I could make the idea clear.

How Swarms Build Colony Cognitive Maps: Now it is worthwhile to look at a simple model of how ants construct cognitive maps, which I read about in a wonderful paper by Mark Millonas and Dante Chialvo. Though I have already mentioned them, I’ll sum up the basic assumptions.

Assumptions:

1. The individual agent (or ant) is memoryless.

2. There is no direct communication between the organisms.

3. There is no spatial diffusion of the pheromone deposited. It remains fixed at a point where it was deposited.

4. Each agent emits pheromone at a constant rate say $\eta$.

Stochastic Transition Probabilities:

Now the state of each agent can be described by a phase variable containing its position $r$ and orientation $\theta$. Since the response at any given time depends solely on the present state and not on the previous history, it suffices to specify the transition probability from one place and orientation $(r,\theta)$ to another $(r',\theta')$ an instant later. Thus the movement of each individual agent can be considered, roughly, a continuous Markov process whose transition probabilities at each instant of time are decided by the pheromone concentration $\sigma(x, t)$.

Using theoretical considerations and generalizations from observations of ant colonies, the response function can be effectively summed up in a two-parameter pheromone weight function:

$\displaystyle W(\sigma) = \left(1 + \frac{\sigma}{1 + \delta\sigma}\right)^{\beta}$

This weight function measures the relative probability of moving to a site $r$ with pheromone density $\sigma(r)$.

One parameter is $\beta$, which measures the degree of randomness with which an agent follows a pheromone trail: for low values of $\beta$ the pheromone concentration does not greatly affect its choice, while for higher values it does.

The other factor is $\displaystyle\frac{1}{\delta}$, which signifies the sensory capacity: it captures the fact that the ant's ability to sense pheromone decreases somewhat at higher concentrations, something like a saturation scenario.
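To make the saturation concrete, here is the weight function in code (a sketch of my own; the parameter values are chosen purely for illustration, not taken from the paper):

```python
def W(sigma, beta=3.5, delta=0.2):
    """Pheromone weight W(sigma) = (1 + sigma/(1 + delta*sigma))**beta.

    beta sets how strongly an ant's choice follows the trail; 1/delta is
    the sensory capacity: the response saturates at high concentration.
    """
    return (1.0 + sigma / (1.0 + delta * sigma)) ** beta

# The marginal effect of extra pheromone shrinks as concentration grows:
for s in (0.0, 1.0, 10.0, 100.0):
    print(s, W(s))
```

Since $\sigma/(1+\delta\sigma) \to 1/\delta$ as $\sigma \to \infty$, the weight can never exceed $(1 + 1/\delta)^{\beta}$ no matter how strong the trail.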

Pheromone Evolution: It is essential to describe how the pheromone evolves. As already assumed, each agent emits pheromone at a constant rate $\eta$, with no spatial diffusion. If the pheromone at a location is not replenished, it gradually evaporates. The pheromonal field so formed contains a memory of the past movements of the agents in space; however, because of the evaporation process, it does not have a very distant memory.

Analysis: Another important parameter is the density of ants, $\rho_0$. Using all these parameters we can define a single parameter, the average pheromonal field $\displaystyle\sigma_0 = \frac{\rho_0 \eta}{\kappa}$, where $\displaystyle \kappa$ is the rate of scent decay (evaporation) mentioned above.

Further detailed analysis can be studied here; with the above background it is just a matter of working through it.

[Evolution of distribution of ants : Source]


Now after continuing with the mathematical analysis in the hyperlink above, we fix the values of the parameters.

Then a large number of ants are placed at random positions, and the movement of each ant is determined by the transition probability $P_{ik}$.

Another assumption is that the pheromone density at each point is zero at $t=0$. Each ant deposits pheromone at a fixed rate $\eta$, and the pheromone evaporates at a fixed rate $\kappa$.
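The steps just described can be sketched as a minimal simulation. This is my own shorthand version: ants on a toroidal lattice move to one of the 8 neighbouring sites with probability proportional to $W(\sigma)$ there; the full model also weights the turning angle relative to the ant's current heading, which I omit, and all parameter values below are illustrative.

```python
import random

random.seed(1)
N = 32                    # lattice size, as in the 32x32 example above
beta, delta = 3.5, 0.2    # trail-following fidelity, inverse sensory capacity
eta, kappa = 0.07, 0.015  # pheromone deposition and evaporation rates
n_ants, steps = 256, 200

def W(s):
    # Pheromone weight from the text: (1 + s/(1 + delta*s))**beta
    return (1.0 + s / (1.0 + delta * s)) ** beta

sigma = [[0.0] * N for _ in range(N)]  # pheromone field, zero at t = 0
ants = [(random.randrange(N), random.randrange(N)) for _ in range(n_ants)]

for t in range(steps):
    moved = []
    for x, y in ants:
        # The 8 neighbouring sites on a toroidal lattice.
        nbrs = [((x + dx) % N, (y + dy) % N)
                for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
        # P_ik: probability of moving to site k proportional to W(sigma_k).
        weights = [W(sigma[i][j]) for i, j in nbrs]
        moved.append(random.choices(nbrs, weights=weights)[0])
    ants = moved
    for x, y in ants:
        sigma[x][y] += eta            # constant-rate deposition
    for row in sigma:
        for j in range(N):
            row[j] *= 1.0 - kappa     # evaporation; weak trails fade

# After enough steps, pheromone concentrates along reinforced paths.
print(max(max(row) for row in sigma))
```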

In the beautiful picture above we see the evolution of a distribution of ants on a 32×32 lattice. A pattern begins to emerge as early as the 100th time step; weak pheromonal paths evaporate completely, and we finally get an emergent ant distribution pattern, as shown in the final image.

The conclusion that Chialvo and Millonas note is that scent-following of the very fundamental type described above (see the assumptions) is by itself sufficient to produce the emergence of complex patterns of organized social insect traffic. Detailed conclusions can be read in this wonderful paper!

References and Suggested for Further Reading:

1. Remembrance of Places Past: A History of Theories of Space. Click here >>

2. The Science of Self-Organization and Adaptivity, Francis Heylighen, Free University of Brussels, Belgium. Click here >>

3. The Self-Organization in the Brain, Christoph von der Malsburg, Depts. of Computer Science, Biology and Physics, University of Southern California.

4. How Swarms Build Cognitive Maps, Dante R. Chialvo and Mark M. Millonas, The Santa Fe Institute of Complexity. Click here >>

5. Social Cognitive Maps, Swarm Collective Perception and Distributed Search on Dynamic Landscapes, Vitorino Ramos, Carlos Fernandes, Agostinho C. Rosa.

Possibly Related:

Gödel, Escher, Bach: A Mental Space Odyssey