This page is updated continually.
I intend to use this page for four purposes:
To list some of the projects that I have undertaken and completed; courses that I have taught or assisted with; papers that I have published; and a reading list of material that I want to study which is not directly related to my present research (though indirectly it is).
Aside: About page (lists interests and contact information)
In reverse chronological order:
1. Practical Regularity Lemma and its Applications.
Time Period: October 2011 onwards
The Szemerédi regularity lemma is a celebrated result in graph theory whose fundamental importance has been realized more and more over the past few years. It was originally advanced as an auxiliary result to prove a long-standing conjecture of Erdős and Turán from 1936. While it has become an important tool for proving theoretical results in graph theory, it does not have many practical applications yet. This project aims to harness the power of the regularity lemma to devise a practical clustering/segmentation methodology, using both the Alon et al. algorithm and the Frieze–Kannan version based on singular values. This builds on some earlier work by Pelillo. Some of the open problems and results are discussed informally in the posts listed below:
New: Using this “Regularity Clustering” method for Image Segmentation and comparisons with methods such as Normalized Cuts (Shi and Malik).
– Importing the Szemeredi Regularity Lemma into Machine Learning. (January 7 2012)
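The central quantity the regularity lemma works with is the edge density between vertex sets, and ε-regularity asks that this density be stable across large subsets. The following is a minimal illustrative sketch in Python, not the project's methodology; the helper names (edge_density, looks_epsilon_regular) are hypothetical, and since the exact ε-regularity check is exponential, the sketch only samples random subset pairs.

```python
import numpy as np

def edge_density(adj, A, B):
    """Edge density d(A, B) = e(A, B) / (|A| * |B|) between vertex sets A and B."""
    return adj[np.ix_(A, B)].sum() / (len(A) * len(B))

def looks_epsilon_regular(adj, A, B, eps=0.25, trials=200, rng=None):
    """Randomized heuristic: sample subsets A' of A and B' of B with
    |A'| >= eps|A| and |B'| >= eps|B|, and flag a violation whenever
    |d(A', B') - d(A, B)| > eps.  (The exhaustive check is exponential.)"""
    rng = np.random.default_rng(rng)
    d = edge_density(adj, A, B)
    size_a = max(1, int(np.ceil(eps * len(A))))
    size_b = max(1, int(np.ceil(eps * len(B))))
    for _ in range(trials):
        a = rng.choice(A, size=size_a, replace=False)
        b = rng.choice(B, size=size_b, replace=False)
        if abs(edge_density(adj, a, b) - d) > eps:
            return False
    return True

# Toy example: a symmetric random graph split into two vertex classes
rng = np.random.default_rng(0)
n = 40
adj = (rng.random((n, n)) < 0.5).astype(float)
adj = np.triu(adj, 1)
adj = adj + adj.T                      # symmetric, no self-loops
A, B = np.arange(0, 20), np.arange(20, 40)
d_AB = edge_density(adj, A, B)
```

A dense random bipartite pair like this is expected to pass the sampled check, which matches the intuition that random-like graphs are the regular case.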
2. Out of Sample Extensions to Spectral Clustering and their Utility for Prediction
Collaborators: Gábor N. Sárközy*, Zachary Pardos and Neil T. Heffernan*.
Time Period: August 2011 – December 2011
Out-of-sample extension is an important issue in spectral clustering, which is an inherently “offline” method: once a clustering has been found, it is not straightforward to add new points to existing clusters. This is easy in centroid-based methods such as k-means, where a new point can simply be assigned to the cluster whose centroid is nearest. In spectral clustering, adding a new point would involve recomputing the graph Laplacian (and its eigenvectors), which is computationally expensive. Thus, there is a need for methods that can add new points to clusters without recalculating the Laplacian and without adding too much noise. In this project we explore various methods for solving this problem, including learning a Mahalanobis distance, k-nearest-neighbor solutions, learning the eigenfunctions, and the Nyström method. We also develop a hypergraph-based bagging methodology that exploits clustering for prediction and uses the out-of-sample extensions to test new points. We then evaluate the out-of-sample extensions in two sets of experiments: with regard to clustering accuracy, and with regard to how much of a boost they give to an associated prediction task via our bagging framework.
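One of the extensions mentioned above, the Nyström method, embeds an unseen point by expanding its affinities to the training points in the already-computed eigenbasis, avoiding a fresh eigendecomposition. The sketch below is a minimal illustration under assumed choices (a Gaussian kernel, the normalized affinity matrix); the function names are hypothetical and this is not the project's actual implementation.

```python
import numpy as np

def rbf_affinity(X, Y, sigma=1.0):
    """Gaussian (RBF) affinities between rows of X and rows of Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def spectral_embedding(X, k=2, sigma=1.0):
    """Embed the training points via the top-k eigenpairs of the
    normalized affinity matrix D^{-1/2} W D^{-1/2}."""
    W = rbf_affinity(X, X, sigma)
    d = W.sum(1)                                   # degrees
    M = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(M)                 # ascending order
    return vals[-k:], vecs[:, -k:], d

def nystrom_extend(x_new, X, vals, vecs, d, sigma=1.0):
    """Approximate the embedding of an unseen point without recomputing
    anything: phi_j(x) ~ (1/lambda_j) * sum_i k_norm(x, x_i) * phi_j(x_i),
    where the new affinities get the same degree normalization."""
    w = rbf_affinity(x_new[None, :], X, sigma)[0]
    w_norm = w / np.sqrt(w.sum() * d)
    return (w_norm @ vecs) / vals

# Two well-separated blobs; extend the embedding to a point near blob 1
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
vals, vecs, d = spectral_embedding(X, k=2)
emb_new = nystrom_extend(np.array([0.1, -0.2]), X, vals, vecs, d)
```

The new point can then be assigned to a cluster by nearest neighbor in the embedding space, which is exactly the cheap "online" step that recomputing the Laplacian would otherwise replace.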
3. Unsupervised Feature Learning for Face Recognition.
4. Utility of Clustering to Generate a Mixture of Experts (for Bootstrap-Aggregation) and hence an ensemble for prediction
5. Unsupervised Knowledge Discovery in Intelligent Tutor Datasets for a variety of interesting tasks
6. Image Fusion
7. SVM and ICA for contrast enhancement in Magnetic Resonance Images
8. Content Based Bio-Medical image retrieval system using Gabor Wavelets and Generic Fourier Descriptors.
9. Investigations on Kernel Selection and Parameter Tuning of Support Vector Machines.
10. An online/offline face-authentication system using Eigenfaces (and ICA) and Support Vector Machines
11. A Speaker Dependent Speech Recognizer for Isolated Hindi Digits on the 8051 Micro-Controller.
Collaborators: Ashutosh Modi, Arun Muralidharan and Dr. K. R. Joshi*
Time Period: January 2008 – March 2008
Suppose the goal is to make a low-cost (under $1) speech recognizer for isolated words (e.g. “Left”, “Right”) or digits that could be used for simple machine control. To satisfy these budgetary constraints, one option is to use an 8-bit microcontroller rather than a DSP. A standard approach would be to implement the recognizer using Hidden Markov Models or Artificial Neural Networks, generally with Mel Frequency Cepstrum Coefficients (MFCC) or Linear Predictive Coefficients as features. All of these require computations beyond the scope of a simple processor/microcontroller like the 8051 and thus cannot be implemented on it, so a simpler algorithm is needed. This project explores the idea of using only zero-crossings for feature extraction, together with Dynamic Time Warping, for isolated word recognition under minimal computational power. Variations on the basic Dynamic Time Warping idea were also explored.
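The appeal of this pipeline is that zero-crossing counting needs only sign comparisons and a counter, and DTW needs only additions and comparisons. The sketch below illustrates the idea in Python for exposition (the actual system ran on the 8051); the frame length and function names are illustrative assumptions, not the project's parameters.

```python
import numpy as np

def zcr_features(signal, frame_len=200):
    """Zero-crossing count per frame: cheap enough for an 8-bit MCU,
    since it needs only sign tests and a counter."""
    n_frames = len(signal) // frame_len
    feats = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        feats.append(np.count_nonzero(np.diff(np.sign(frame)) != 0))
    return np.array(feats, dtype=float)

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping between two
    1-D feature sequences, using only adds and mins."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Recognition = nearest stored template under DTW
t = np.linspace(0, 1, 2000)
word_a = np.sin(2 * np.pi * 50 * t)      # stand-in for a low-pitched word
word_b = np.sin(2 * np.pi * 300 * t)     # stand-in for a high-pitched word
query = np.sin(2 * np.pi * 55 * t)       # a slightly varied repeat of word_a
templates = {"a": zcr_features(word_a), "b": zcr_features(word_b)}
best = min(templates, key=lambda k: dtw_distance(templates[k], zcr_features(query)))
# best == "a": the query's zero-crossing profile warps onto template "a"
```

DTW's warping path is what absorbs the speaking-rate variation between a stored template and a new utterance, which fixed-frame template matching cannot do.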
As Visiting Lecturer/Co-Lecturer
1. Bio-Informatics (Visiting Lecturer)
2. Image Processing Lab (Instructor)
3. Digital Image Processing (Visiting Lecturer)
As Teaching Assistant
1. Design and Analysis of Algorithms (Comp. Sci – Senior Year)
2. Introduction to Algorithms (Comp. Sci. – Sophomore)
3. Foundations of Computer Science (Comp. Sci. – Junior Year)
4. Introduction to Artificial Intelligence (Comp. Sci. – Senior Year)
5. Discrete Mathematics (Mathematics Dept. – Sophomore)
6. Algorithms (Comp. Sci. – Sophomore) (second time)
7. Foundations of Computer Science (Comp. Sci. – Junior Year) (second time)
8. Machine Organization and Assembly Language (Comp. Sci. Sophomore)
Arranged in reverse chronological order. Manuscripts in preparation are also listed and, being the most recent, appear at the top.
1. Practical Regularity Lemma and its Applications (with Gábor Sárközy, Fei Song and Endre Szemerédi). Manuscript. TBD.
2. Practical Regularity Lemma and its Applications (with Gábor Sárközy, Fei Song and Endre Szemerédi). In preparation; to be submitted to Elsevier Applied Mathematics and Computation.
This is a journal version of (1) and will include additional results, one of them being the use of the Frieze–Kannan algorithm ((1) uses only Alon et al.), as well as ideas on extending the method to hypergraphs.
3. Co-Clustering by Spectral Bipartite Graph Partitioning for Student Modeling (with Zach Pardos, Gábor Sárközy and Neil T. Heffernan ). In the Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, pp. 33-40, 2012.
4. The Real World Significance of Performance Prediction (with Zach Pardos and Qing Yang Wang). In the Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, pp. 192-195, 2012.
5. Clustered Knowledge Tracing (with Zach Pardos, Gábor Sárközy and Neil Heffernan). In the Proceedings of the 11th International Conference on Intelligent Tutoring Systems, S.A. Cerri and B. Clancey (Eds.): ITS 2012, LNCS 7315, pp. 407–412, Springer-Verlag Berlin Heidelberg, 2012.
6. Out of Sample Extensions to Spectral Clustering (with Gábor Sárközy, Zach Pardos and Neil T. Heffernan). Under review, Springer Statistics and Computing Journal, 2012.
7. The Utility of Clustering in Prediction Tasks (with Zach Pardos and Neil T. Heffernan). Under review, IEEE Transactions on Systems, Man and Cybernetics (September 2011).
8. Spectral Clustering in Educational Data Mining (with Gábor Sárközy, Zach Pardos and Neil T. Heffernan). TBD, Journal of Educational Data Mining (2012).
9. Spectral Clustering in Educational Data Mining (with Gábor Sárközy, Zach Pardos and Neil T. Heffernan). In Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, Netherlands, pp. 129-138, 2011.
This is a conference version of (8).
10. Clustering Students to Generate an Ensemble to Improve Standard Test Score Predictions (with Zach Pardos and Neil T. Heffernan). In the Proceedings of the 15th International Conference on Artificial Intelligence in Education (AIED 2011), G. Biswas et al. (Eds.), LNAI 6738, pp. 377–384, Auckland, New Zealand, 2011.
Fellow Student Collaborators
The coordinates are (x, y), where x is the number of papers and y is the time of the first publication.
This section maintains a reading list of technical material that I wish to work through over the next few months, especially in areas that are not directly related to my work but that I find stimulating and potentially useful later.