Feeds:
Posts

Implementation and Abstraction in Mathematics

I recently noticed on arXiv that the following manuscript “Implementation and Abstraction in Mathematics” by David McAllester. A couple of years ago, I had taken a graduate course taught by David that had a similar flavour (the material in the manuscript is more advanced, in particular the main results, not to mention it is better organized and the presentation more polished), presenting a type theoretic foundation of mathematics. Although I can’t say I did very well in the course, I certainly enjoyed the ideas in it very much, and thus the above manuscript might be worth a look. Perhaps it might be a good idea to audit that course again, just to make sure I understand the main ideas better this time. :)

Some Proofs of the Cauchy-Schwarz Inequality

Over the past 4-5 months whenever there is some time to spare, I have been working through The Cauchy-Schwarz Master Class by J. Michael Steele. And, although I am still left with the last two chapters, I have been reflecting on the material already covered in order to get a better perspective on what I have been slowly learning over the months. This blog post is a small exercise in this direction.

Ofcourse, there is nothing mysterious about proving the Cauchy-Schwarz inequality; it is fairly obvious and basic. But I thought it still might be instructive (mostly to myself) to reproduce some proofs that I know out of memory (following a maxim of my supervisor on a white/blackboard blogpost). Although, why Cauchy-Schwarz keeps appearing all the time and what makes it so useful and fundamental is indeed quite interesting and non-obvious. And like Gil Kalai notes, it is also unclear why is it that it is Cauchy-Schwarz which is mainly useful. I feel that Steele’s book has made me appreciate this importance somewhat more (compared to 4-5 months ago) by drawing to many concepts that link back to Cauchy-Schwarz.

Before getting to the post, a word on the book: This book is perhaps amongst the best mathematics book that I have seen in many years. True to its name, it is indeed a Master Class and also truly addictive. I could not put aside the book completely once I had picked it up and eventually decided to go all the way. Like most great books, the way it is organized makes it “very natural” to rediscover many susbtantial results (some of them named) appearing much later by yourself, provided you happen to just ask the right questions. The emphasis on problem solving makes sure you make very good friends with some of the most interesting inequalities. The number of inequalities featured is also extensive. It starts off with the inequalities dealing with “natural” notions such as monotonicity and positivity and later moves onto somewhat less natural notions such as convexity. I can’t recommend this book enough!

Now getting to the proofs: Some of these proofs appear in Steele’s book, mostly as either challenge problems or as exercises. All of them were solvable after some hints.

________________

Proof 1: A Self-Generalizing proof

This proof is essentially due to Titu Andreescu and Bogdan Enescu and has now grown to be my favourite Cauchy-Schwarz proof.

We start with the trivial identity (for $a, b \in \mathbb{R}, x >0, y>0$):

Identity 1: $(ay - bx)^2 \geq 0$

Expanding we have

$a^2y^2 + b^2x^2 - 2abxy \geq 0$

Rearranging this we get:

$\displaystyle \frac{a^2y}{x} + \frac{b^2x}{y} \geq 2ab$

Further: $\displaystyle a^2 + b^2 + \frac{a^2y}{x} + \frac{b^2x}{y} \geq (a+b)^2$;

Rearranging this we get the following trivial Lemma:

Lemma 1: $\displaystyle \frac{(a+b)^2}{(x+y)} \leq \frac{a^2}{x} + \frac{b^2}{y}$

Notice that this Lemma is self generalizing in the following sense. Suppose we replace $b$ with $b + c$ and $y$ with $y + z$, then we have:

$\displaystyle \frac{(a+b+c)^2}{(x+y+z)} \leq \frac{a^2}{x} + \frac{(b+c)^2}{y+z}$

But we can apply Lemma 1 to the second term of the right hand side one more time. So we would get the following inequality:

$\displaystyle \frac{(a+b+c)^2}{(x+y+z)} \leq \frac{a^2}{x} + \frac{b^2}{y} + \frac{c^2}{z}$

Using the same principle $n$ times we get the following:

$\displaystyle \frac{(a_1+a_2+ \dots a_n)^2}{(x_1+x_2+ \dots + x_n)} \leq \frac{a_1^2}{x_1} + \frac{a_2^2}{x_2} + \dots + \frac{a_n^2}{x_n}$

Now substitute $a_i = \alpha_i \beta_i$ and $x_i = \beta_i^2$ to get:

$\displaystyle ( \sum_{i=1}^n \alpha_i \beta_i )^2 \leq \sum_{i=1}^n ( \alpha_i )^2 \sum_{i=1}^n ( \beta_i )^2$

This is just the Cauchy-Schwarz Inequality, thus completing the proof.

________________

Proof 2: By Induction

Again, the Cauchy-Schwarz Inequality is the following: for $a, b \in \mathbb{R}$

$\displaystyle ( \sum_{i=1}^n a_i b_i )^2 \leq \sum_{i=1}^n ( a_i )^2 \sum_{i=1}^n ( b_i )^2$

For proof of the inequality by induction, the most important thing is starting with the right base case. Clearly $n = 1$ is trivially true, suggesting that it is perhaps not of much import. So we consider the case for $n = 2$. Which is:

$\displaystyle ( a_1 b_1 + a_2 b_2 )^2 \leq ( a_1^2 + a_2^2 ) (b_1^2 + b_2^2)$

To prove the base case, we simply expand the expressions. To get:

$\displaystyle a_1^2 b_1^2 + a_2^2 b_2^2 + 2 a_1 b_1 a_2 b_2 \leq a_1^2 b_1^2 + a_1^2 b_2^2 + a_2^2 b_1^2 + a_2^2 b_2^2$

Which is just:

$\displaystyle a_1^2 b_2^2 + a_2^2 b_1^2 - 2 a_1 b_1 a_2 b_2 \geq 0$

Or:

$\displaystyle (a_1 b_2 - a_2 b_1 )^2 \geq 0$

Which proves the base case.

Moving ahead, we assume the following inequality to be true:

$\displaystyle ( \sum_{i=1}^k a_i b_i )^2 \leq \sum_{i=1}^k ( a_i )^2 \sum_{i=1}^k ( b_i )^2$

To establish Cauchy-Schwarz, we have to demonstrate, assuming the above, that

$\displaystyle ( \sum_{i=1}^{k+1} a_i b_i )^2 \leq \sum_{i=1}^{k+1} ( a_i )^2 \sum_{i=1}^{k+1} ( b_i )^2$

So, we start from $H(k)$:

$\displaystyle ( \sum_{i=1}^k a_i b_i )^2 \leq \sum_{i=1}^k ( a_i )^2 \sum_{i=1}^k ( b_i )^2$

we further have,

$\displaystyle \Big(\sum_{i=1}^k a_i b_i\Big) + a_{k+1}b_{k+1} \leq \Big(\sum_{i=1}^k ( a_i )^2\Big)^{1/2} \Big(\sum_{i=1}^k ( b_i )^2\Big)^{1/2} + a_{k+1}b_{k+1} \ \ \ \ (1)$

Now, we can apply the case for $n = 2$. Recall that: $\displaystyle a_1 b_1 + a_2 b_2 \leq (a_1^2 + a_2^2)^{1/2} (b_1^2 + b_2^2)^{1/2}$

Thus, using this in the R. H. S of $(1)$, we would now have:

$\displaystyle \Big(\sum_{i=1}^{k+1} a_i b_i\Big) \leq \Big(\sum_{i=1}^k ( a_i )^2 + a_{k+1}^2\Big)^{1/2} \Big(\sum_{i=1}^k ( b_i )^2 + b_{k+1}^2\Big)^{1/2}$

Or,

$\displaystyle \Big(\sum_{i=1}^{k+1} a_i b_i\Big) \leq \Big(\sum_{i=1}^{k+1} ( a_i )^2 \Big)^{1/2} \Big(\sum_{i=1}^{k+1} ( b_i )^2 \Big)^{1/2}$

This proves the case $H(k+1)$ on assuming $H(k)$. Thus also proving the Cauchy-Schwarz inequality by the principle of mathematical induction.

________________

Proof 3: For Infinite Sequences using the Normalization Trick:

Problem: For $a, b \in \mathbb{R}$ If $\displaystyle \Big(\sum_{i=1}^{\infty} a_i^2 \Big) < \infty$ and $\displaystyle \Big(\sum_{i=1}^{\infty} b_i^2 \Big) < \infty$ then is $\displaystyle \Big(\sum_{i=1}^{\infty} |a_i||b_i|\Big) < \infty$ ?

Note that this is easy to establish. We simply start with the trivial identity $(x-y)^2 \geq 0$ which in turn gives us $\displaystyle xy \leq \frac{x^2}{2} + \frac{y^2}{2}$

Next, take $x = |a_i|$ and $y = |b_i|$ on summing up to infinity on both sides, we get the following:

$\displaystyle \Big( \sum_{i=1}^{\infty} |a_i||b_i| \Big)^2 \leq \frac{1}{2}\Big(\sum_{i=1}^{\infty} a_i^2\Big) + \frac{1}{2} \Big(\sum_{i=1}^{\infty} b_i^2\Big)\ \ \ \ \ \ \ \ (2)$

From this it immediately follows that

$\displaystyle \Big(\sum_{i=1}^{\infty} |a_i||b_i|\Big) < \infty$

Now let

$\displaystyle \hat{a}_i = \frac{a_i}{\Big(\sum_j a_i^2\Big)^{1/2}}$ and

$\displaystyle \hat{b}_i = \frac{b_i}{\Big(\sum_j b_i^2\Big)^{1/2}}$; substituting in $(2)$, we get:

$\displaystyle \Big( \sum_{i=1}^{\infty} |\hat{a}_i||\hat{b}_i| \Big)^2 \leq \frac{1}{2}\Big(\sum_{i=1}^{\infty} \hat{a}_i^2\Big) + \frac{1}{2} \Big(\sum_{i=1}^{\infty} \hat{b}_i^2\Big)$ or,

$\displaystyle \Bigg(\sum_{i = 1}^{\infty}\frac{a_i}{\Big(\sum_j a_j^2\Big)^{1/2}}\frac{b_i}{\Big(\sum_j b_j^2\Big)^{1/2}}\Bigg)^2 \leq \frac{1}{2} + \frac{1}{2}$

Which simply gives back Cauchy’s inequality for infinite sequences thus completing the proof:

$\displaystyle \Big(\sum_{i=1}^{\infty} a_i b_i\Big)^2 \leq \Big(\sum_{i=1}^{\infty}a_i^2\Big) \Big(\sum_{i=1}^{\infty}b_i^2\Big)$

________________

Proof 4: Using Lagrange’s Identity

We first start with a polynomial which we denote by $\mathbf{Q}_n$:

$\mathbf{Q}_n = \big(a_1^2 + a_2^2 + \dots + a_n^2 \big) \big(b_1^2 + b_2^2 + \dots + b_n^2 \big) - \big(a_1b_1 + a_2b_2 + \dots + a_nb_n\big)^2$

The question to now ask, is $\mathbf{Q}_n \geq 0$? To answer this question, we start of by re-writing $\mathbf{Q}_n$ in a “better” form.

$\displaystyle \mathbf{Q}_n = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i^2 b_j^2 - \sum_{i=1}^{n}\sum_{j=1}^{n} a_ib_i a_jb_j$

Next, as J. Michael Steele puts, we pursue symmetry and rewrite the above so as to make it apparent.

$\displaystyle \mathbf{Q}_n = \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_i^2 b_j^2 + a_j^2 b_i^2\big) - \frac{2}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_ib_i a_jb_j$

Thus, we now have:

$\displaystyle \mathbf{Q}_n = \frac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_i b_j - a_j b_i\big)^2$

This makes it clear that $\mathbf{Q}_n$ can be written as a sum of squares and hence is always postive. Let us write out the above completely:

$\displaystyle \sum_{i=1}^{n}\sum_{j=1}^{n} a_i^2 b_j^2 - \sum_{i=1}^{n}\sum_{j=1}^{n} a_ib_ib_jb_j = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_ib_j -a_jb_i\big)^2$

Now, reversing the step we took at the onset to write the L.H.S better, we simply have:

$\displaystyle \sum_{i}^{n} a_i^2 \sum_{}^{n} b_i^2 - \big(\sum_{i=1}^n a_ib_i\big)^2 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \big(a_ib_j -a_jb_i\big)^2$

This is called Lagrange’s Identity. Now since the R.H.S. is always greater than or equal to zero. We get the following inequality as a corrollary:

$\displaystyle \big(\sum_{i=1}^n a_ib_i\big)^2 \leq \sum_{i}^{n} a_i^2 \sum_{}^{n} b_i^2$

This is just the Cauchy-Schwarz inequality, completing the proof.

________________

Proof 5: Gram-Schmidt Process gives an automatic proof of Cauchy-Schwarz

First we quickly review the Gram-Schmidt Process: Given a set of linearly independent elements of a real or complex inner product space $\big(V,\langle\cdot, \cdot\rangle\big)$, $\mathbf{x_1}, \mathbf{x_2}, \dots, \mathbf{x_n}$. We can get an orthonormal set of $n$ elemets $\mathbf{e_1}, \mathbf{e_2}, \dots, \mathbf{e_n}$ by the simple recursion (after setting $\displaystyle \mathbf{e_1 = \frac{\mathbf{x_1}}{\|x_1 \|}}$).

$\displaystyle \mathbf{z_k} = \mathbf{x_k} - \sum_{j=1}^{k-1} \langle\mathbf{x_k},\mathbf{e_j}\rangle\mathbf{e_j}$ and then

$\displaystyle \mathbf{e_k} = \frac{\mathbf{z_k}}{\|\mathbf{z_k}\|}$

for $k = 2, 3, \dots, n$.

Keeping the above in mind, assume that $\| x\| = 1$. Now let $x = e_1$. Thus, we have:

$\mathbf{z} = \mathbf{y} - \langle \mathbf{y}, \mathbf{e_1}\rangle\mathbf{e_1}$

Giving: $\displaystyle \mathbf{e_2} = \frac{\mathbf{z}}{\|\mathbf{z}\|}$. Rearranging we have:

$\displaystyle \|\mathbf{z}\|\mathbf{e_2} = \mathbf{y} - \langle \mathbf{y}, \mathbf{e_1}\rangle\mathbf{e_1}$ or

$\displaystyle \mathbf{y} = \langle \mathbf{y},\mathbf{e_1}\rangle \mathbf{e_1} + \|\mathbf{z}\|\mathbf{e_2}$ or

$\displaystyle \mathbf{y} = \mathbf{c_1} \mathbf{e_1} + \mathbf{c_2}\mathbf{e_2}$ where $c_1, c_2$ are constants.

Now note that: $\displaystyle \langle\mathbf{x},\mathbf{y}\rangle = c_1$ and

$\displaystyle \langle \mathbf{y},\mathbf{y}\rangle = |c_1|^2 + |c_2|^2$. The following bound is trivial:

$\displaystyle |c_1| \leq (|c_1|^2 + |c_2|^2)^{1/2}$. But note that this is simply $\langle x,y \rangle \leq \langle y,y \rangle^{1/2}$

Which is just the Cauchy-Schwarz inequality when $\|x\| = 1$.

________________

Proof 6: Proof of the Continuous version for d =2; Schwarz’s Proof

For this case, the inequality may be stated as:

Suppose we have $S \subset \mathbb{R}^2$ and that $f: S \to \mathbb{R}$ and $g: S \to \mathbb{R}$. Then consider the double integrals:

$\displaystyle A = \iint_S f^2 dx dy$, $\displaystyle B = \iint_S fg dx dy$ and $\displaystyle C = \iint_S g^2 dx dy$. These double integrals must satisfy the following inequality:

$|B| \leq \sqrt{A} . \sqrt{C}$.

The proof given by Schwarz as is reported in Steele’s book (and indeed in standard textbooks) is based on the following observation:

The real polynomial below is always non-negative:

$\displaystyle p(t) = \iint_S \Big( t f(x,y) + g(x,y) \Big)^2 dx dy = At^2 + 2Bt + C$

$p(t) > 0$ unless $f$ and $g$ are proportional. Thus from the binomial formula we have that $B^2 \leq AC$, moreover the inequality is strict unless $f$ and $g$ are proportional.

________________

Proof 7: Proof using the Projection formula

Problem: Consider any point $x \neq 0$ in $\mathbb{R}^d$. Now consider the line that passes through this point and origin. Let us call this line $\mathcal{L} = \{ tx: t \in \mathbb{R}\}$. Find the point on the line closest to any point $v \in \mathbb{R}^d$.

If $P(v)$ is the point on the line that is closest to $v$, then it is given by the projection formula: $\displaystyle P(v) = x \frac{\langle x, v \rangle }{\langle x, x \rangle}$

This is fairly elementary to establish. To find the value of $t$, such that distance $\rho(v,tx)$ is minimized, we can simply consider the squared distance $\rho^2(v,tx)$ since it is easier to work with. Which by definition is:

$\displaystyle \rho^2(v,tx) = \langle v - x, v - tx \rangle$

which is simply:

$\displaystyle \rho^2(v,tx) = \langle v, v \rangle - 2t \langle v, x \rangle + t^2 \langle x, x \rangle$

$\displaystyle = \langle x, x \rangle \bigg( t^2 -2t \frac{\langle v, x \rangle}{\langle x, x \rangle} + \frac{\langle v, v \rangle}{\langle x, x \rangle}\bigg)$

$\displaystyle = \langle x, x \rangle \bigg\{ \bigg(t - \frac{\langle v,x \rangle}{\langle x,x \rangle}\bigg)^2 - \frac{\langle v,x \rangle^2}{\langle x,x \rangle^2}\bigg\} + \frac{\langle v, v \rangle}{\langle x, x \rangle}\bigg)$

$\displaystyle = \langle x, x \rangle \bigg\{ \bigg(t - \frac{\langle v,x \rangle}{\langle x,x \rangle}\bigg)^2 - \frac{\langle v,x \rangle^2}{\langle x,x \rangle^2} + \frac{\langle v, v \rangle}{\langle x, x \rangle} \bigg\}$

$\displaystyle = \langle x, x \rangle \bigg\{ \bigg(t - \frac{\langle v,x \rangle}{\langle x,x \rangle}\bigg)^2 - \frac{\langle v,x \rangle^2}{\langle x,x \rangle^2} + \frac{\langle v, v \rangle \langle x, x \rangle}{\langle x, x \rangle^2} \bigg\}$

So, the value of $t$ for which the above is minimized is $\displaystyle \frac{\langle v,x \rangle}{\langle x,x \rangle}$. Note that this simply reproduces the projection formula.

Therefore, the minimum squared distance is given by the expression below:

$\displaystyle \min_{t \in \mathbb{R}} \rho^2(v, tx) = \frac{\langle v,v\rangle \langle x,x \rangle - \langle v,x \rangle^2}{\langle x,x \rangle}$

Note that the L. H. S is always positive. Therefore we have:

$\displaystyle \frac{\langle v,v\rangle \langle x,x \rangle - \langle v,x \rangle^2}{\langle x,x \rangle} \geq 0$

Rearranging, we have:

$\displaystyle \langle v,x \rangle^2 \leq \langle v,v\rangle \langle x,x \rangle$

Which is just Cauchy-Schwarz, thus proving the inequality.

________________

Proof 8: Proof using an identity

A variant of this proof is amongst the most common Cauchy-Schwarz proofs that are given in textbooks. Also, this is related to proof (6) above. However, it still has some value in its own right. While also giving an useful expression for the “defect” for Cauchy-Schwarz like the Lagrange Identity above.

$P(t) = \langle v - tw, v - tw \rangle$. Clearly $P(t) \geq 0$.

To find the minimum of this polynomial we find its derivative w.r.t $t$ and setting to zero:

$P'(t) = 2t \langle w, w \rangle - 2 \langle v, w \rangle = 0$ giving:

$\displaystyle t_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle}$

Clearly we have $P(t) \geq P(t_0) \geq 0$. We consider:

$P(t_0) \geq 0$, substituting $\displaystyle t_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle}$ we have:

$\displaystyle \langle v,v \rangle - \frac{\langle v, w \rangle}{\langle w, w \rangle} \langle v,w \rangle - \frac{\langle v, w \rangle}{\langle w, w \rangle} \langle w,v \rangle + \frac{\langle v, w \rangle^2}{\langle w, w \rangle^2}\langle w,w \rangle \geq 0 \ \ \ \ \ \ \ \ (A)$

Just rearrangine and simplifying:

$\displaystyle \langle v,v \rangle \langle w,w \rangle - \langle v, w \rangle^2 \geq 0$

This proves Cauchy-Schwarz inequality.

Now suppose we are interested in an expression for the defect in Cauchy-Schwarz i.e. the difference $\displaystyle \langle v,v \rangle \langle w,w \rangle - \langle v, w \rangle^2$. For this we can just consider the L.H.S of equation $(A)$ since it is just $\displaystyle \langle w,w \rangle \Big(\langle v,v \rangle \langle w,w \rangle - \langle v, w \rangle^2\Big)$.

i.e. Defect =

$\displaystyle \langle w,w \rangle \bigg(\langle v,v \rangle - 2 \frac{\langle v,w \rangle^2}{\langle w,w \rangle} + \frac{\langle v,w \rangle^2}{\langle w,w \rangle}\bigg)$

Which is just:

$\displaystyle \langle w,w \rangle\bigg(\Big\langle v - \frac{\langle w,v \rangle}{\langle w,w \rangle}w,v - \frac{\langle w,v \rangle}{\langle w,w \rangle}w \Big\rangle\bigg)$

This defect term is much in the spirit of the defect term that we saw in Lagrange’s identity above, and it is instructive to compare them.

________________

Proof 9: Proof using the AM-GM inequality

Let us first recall the AM-GM inequality:

For non-negative reals $x_1, x_2, \dots x_n$ we have the following basic inequality:

$\displaystyle \sqrt[n]{x_1 x_2 \dots x_n} \leq \Big(\frac{x_1 + x_2 + \dots x_n}{n}\Big)$.

Now let us define $\displaystyle A = \sqrt{a_1^2 + a_2^2 + \dots + a_n^2}$ and $\displaystyle B = \sqrt{b_1^2 + b_2^2 + \dots + b_n^2}$

Now consider the trivial bound (which gives us the AM-GM): $(x-y)^2 \geq 0$, which is just $\displaystyle \frac{1}{2}\Big(x^2 + y^2\Big) \geq xy$. Note that AM-GM as stated above for $n = 2$ is immediate when we consider $x \to \sqrt{x}$ and $y \to \sqrt{y}$

Using the above, we have:

$\displaystyle \frac{1}{2} \Big(\frac{a_i^2}{A^2} + \frac{b_i^2}{B^2}\Big) \geq \frac{a_ib_i}{AB}$

Summing over $n$, we have:

$\displaystyle \sum_{i=1}^n\frac{1}{2} \Big(\frac{a_i^2}{A^2} + \frac{b_i^2}{B^2}\Big) \geq \sum_{i=1}^n \frac{a_ib_i}{AB}$

But note that the L.H.S equals 1, therefore:

$\displaystyle \sum_{i=1}^n \frac{a_ib_i}{AB} \leq 1$ or $\displaystyle \sum_{i=1}^n a_ib_i \leq AB$

Writing out $A$ and $B$ as defined above, we have:

$\displaystyle \sum_{i=1}^n a_ib_i \leq \sqrt{\sum_{i=1}^na_i^2}\sqrt{\sum_{i=1}^nb_i^2}$.

Thus proving the Cauchy-Schwarz inequality.

________________

Proof 10: Using Jensen’s Inequality

We begin by recalling Jensen’s Inequality:

Suppose that $f: [p, q] \to \mathbb{R}$ is a convex function. Also suppose that there are non-negative numbers $p_1, p_2, \dots, p_n$ such that $\displaystyle \sum_{i=1}^{n} p_i = 1$. Then for all $x_i \in [p, q]$ for $i = 1, 2, \dots, n$ one has:

$\displaystyle f\Big(\sum_{i=1}^{n}p_ix_i\Big) \leq \sum_{i=1}^{n}p_if(x_i)$.

Now we know that $f(x) = x^2$ is convex. Applying Jensen’s Inequality, we have:

$\displaystyle \Big(\sum_{i=1}^{n} p_i x_i \Big)^2 \leq \sum_{i=1}^{n} p_i x_i^2$

Now, for $b_i \neq 0$ for all $i = 1, 2, \dots, n$, let $\displaystyle x_i = \frac{a_i}{b_i}$ and let $\displaystyle p_i = \frac{b_i^2}{\sum_{i=1}^{n}b_i^2}$.

Which gives:

$\displaystyle \Big(\sum_{i=1}^n \frac{a_ib_i}{\sum_{i=1}^{n}b_i^2}\Big)^2 \leq \Big(\sum_{i=1}^{n}\frac{a_i^2}{\sum_{i=1}^{n}b_i^2}\Big)$

Rearranging this just gives the familiar form of Cauchy-Schwarz at once:

$\displaystyle \Big(\sum_{i=1}^{n} a_ib_i\Big)^2 \leq \Big(\sum_{i=1}^{n} a_i^2\Big)\Big(\sum_{i=1}^{n} b_i^2\Big)$

________________

Proof 11: Pictorial Proof for d = 2

Here (page 4) is an attractive pictorial proof by means of tilings for the case $d = 2$ by Roger Nelson.

________________

1966 Film on John von Neumann

John von Neumann made so many fundamental contributions that Paul Halmos remarked that it was almost like von Neumann maintained a list of various subjects that he wanted to touch and develop and he systematically kept ticking items off. This sounds to be remarkably true if one just has a glance at the dizzyingly long “known for” column below his photograph on his wikipedia entry.

John von Neumann with one of his computers.

Since Neumann died (young) in 1957, rather unfortunately, there aren’t very many audio/video recordings of his (if I am correct just one 2 minute video recording exists in the public domain so far).

I recently came across a fantastic film on him that I would very highly recommend. Although it is old and the audio quality is not the best, it is certainly worth spending an hour on. The fact that this film features Eugene Wigner, Stanislaw UlamOskar Morgenstern, Paul Halmos (whose little presentation I really enjoyed), Herman Goldstein, Hans Bethe and Edward Teller (who I heard for the first time, spoke quite interestingly) alone makes it worthwhile.

Update: The following youtube links have been removed for breach of copyright. The producer of the film David Hoffman, tells us that the movie should be available as a DVD for purchase soon. Please check the comments to this post for more information.

Part 1

Find Part 2 here.

________________

Onionesque Reality Home >>

Ramanujan: Letters from an Indian Clerk (1987)

I have never done anything useful. No discovery of mine has made or is likely to make, directly or indirectly, for good or for ill, the least difference to the amenity of the world. Judged by all practical standards, the value of my mathematical life is nil. And outside mathematics it is trivial anyhow. The case for my life then, or for anyone else who has been a mathematician in the same sense that I have been one is this: That I have added something to knowledge and helped others to add more, and that these somethings have a value that differ in degree only and not in kind from that of the creations of the great mathematicians or any of the other artists, great or small who’ve left some kind of memorial behind them.

I still say to myself when I am depressed and and find myself forced to listen to pompous and tiresome people “Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.” — G. H. Hardy (A Mathematician’s Apology)

Yesterday I  discovered an old (1987) British documentary on Srinivasa Ramanujan, which was pretty recently uploaded. I was not surprised to see that the video was made available by Christopher J. Sykes, who has been uploading older documentaries (including those by himself) on youtube (For example – The delightful “Richard Feynman and the Quest for Tannu Tuva” was uploaded by him as well. I blogged about it a couple of years ago!). Thanks Chris for these gems!

Since the documentary is pretty old, it is a little slow. But if you have one hour to spare, you should watch it! It features his (now late) widow, a quite young Béla Bollobás and the late Nobel Laureate Subrahmanyan Chandrasekhar. The video is embedded below – in case of any issues also find it linked here.

________________

[Ramanujan: Letters from an Indian Clerk]

________________

I could have written something on Ramanujan, but decided against it. Instead, I’d close this post with an excerpt from a wonderful essay by Freeman Dyson on Ramanujan published in Ramanujan: Essays and Surveys by Berndt and Rankin

Ramanujan: Essays and Surveys (click on image to view on Amazon)

A Walk Through Ramanujan’s Garden — F. J. Dyson

[…] The inequalities (8), (9) and (10) were undoubtedly true, but I had no idea how to prove them in 1942. In the end I just gave up trying to prove them and published them as conjectures in our student magazine “Eureka”. Since there was half a page left over at the end of my paper, they put in a poem by my friend Alison Falconer who was also a poet and mathematician. […]

Short Vision

Thought is the only way that leads to life.

All else is hollow spheres

Reflecting back

In heavy imitation

And blurred degeneration

A senseless image of our world of thought.

.

Man thinks he is the thought which gives him life.

He binds a sheaf and claims it as himself.

He is a ring through which we pass swinging ropes

Which merely move a little as he slips.

.

The Ropes are Thought.

The Space is Time.

Could he but see, then he might climb.

________________

Onionesque Reality Home >>

The Opinions of Doron Zeilberger

I presume that a lot of people who drop by this blog are familiar with Doron Zeilberger‘s opinions already. Even though a lot of people who know me personally get linked frequently to some or the other opinions of Zeilberger, I thought it would be a good idea to blog about them in any case, for I believe more people should know about them, even if the number is not high enough.

For a one line introduction, Doron Zeilberger is a noted Israeli mathematician who is presently a professor at Rutgers. He maintains a “blog” which has the title “Dr. Z’s Opinions” in which there are an assortment of views on topics broadly related to Mathematics. Zeilberger certainly has a flair for writing and oftentimes makes hard-hitting points which might outrage many (his latest writing on Turing for example is sure to make many people shake their heads in disagreement – me included) which usually could be seen as chipping away at commonly held opinions. All the interestingness about his opinions aside, his sense of humour makes them entertaining in any case. Even if one disagrees with them I would highly recommend them as long as one exercises some discretion in sifting through these Indiscrete Thoughts.

I found his opinions many years ago while searching for something witty about weekly colloquiums which I could send to some of my colleagues who somehow took pride in not going for them. Skipping colloquiums is a habit that I have not understood well. He wrote the following about it (Opinion 20):

Socrates said that one should always marry. If your spouse would turn out to be nice, then you’ll be a happy person. If your spouse would turn out to be a bitch/bastard, then you’ll become a philosopher.

The same thing can be said about the weekly colloquium. If the speaker is good, you’ll learn something new and interesting, usually outside your field. If the speaker is bad, you’ll feel that you have accomplished something painful, like fasting, or running a marathon, so while you may suffer during the talk, you’ll feel much better after it.

What prompted me to blog about his “blog” was a recent opinion of his. Some months ago when Endre Szemeredi won the Abel Prize, I got very excited, almost like a school boy and the next morning I went to the college library to see what the national dailies had to say about the achievement. To my surprise and dismay none of the dailies seem to have noticed it at all! Three or four days after that the New York Times carried a full page advertisement by Rutgers University having a great photo of Szemeredi, however that doesn’t count as news. I was delighted to see that Doron Zeilberger noticed this too and wrote about it (see his 122nd Opinion)

Let me conclude by wishing Endre, “the computer science professor who never touched a computer”, many more beautiful and deep theorems, and console him (and us) that in a hundred years, and definitely in a thousand years, he would be remembered much more than any contemporary sports or movie star, and probably more than any living Nobel prize winner.

One of my all time favourite opinions of his is Opinion 62, which compares the opposing styles of genius of two men I have had the highest respect for – Israel Gelfand and Alexander Grothendieck. I often send it to people who I think are highly scientifically talented but somehow waste time in expending energy in useless causes than trying to do science (especially if one doesn’t have an intellect comparable to some fraction of Grothendieck’s)! I take the liberty of reproducing the entire opinion here –

I just finished reading Allyn Jackson’s fascinating two-part article about the great mathematical genius Alexandre Grothendieck (that appeared in the Notices of the Amer. Math. Soc.) , and Pierre Cartier’s extremely moving and deep essay `Une pays dont on ne conaitrait que le nom: Le “motifs” de Grothendieck’. (that appeared in the very interesting collection “Le Reel en mathematiques”, edited by P. Cartier and Nathalie Charraud, and that represents the proceedings of a conference about psychoanalysis and math).

In Pierre Cartier’s article, in addition to an attempt at a penetrating “psychoanalysis” he also gives a very lucid non-technical summary of Grothendieck’s mathematical contributions. From this it is clear that one of the greatest giants on whose shoulders Grothendieck stood was Israel Gelfand, whom I am very fortunate to know personally (I am one of the few (too few!) regulars that attend his weekly seminar at Rutgers). I couldn’t help notice the great contrast between these two Giants, and their opposing styles of Genius.

Myself, I am not even an amateaur psychoanalyst, but motives and psi aside, I can easily explain why Grothendieck stopped doing math a long time ago (hence, died, according to Erdos’s nomenclature), while Gelfand, at age 91, is as active and creative as ever.

First and foremost, Grothendieck is a dogmatic purist (like many of the Bourbakists). He dislikes any influences from outside mathematics, or even from other subareas of math. In particular, he always abhored mathematical physics. Ironically, as Cartier explains so well, many major applications of his ground-breaking work were achieved by interfacing it with mathematical physics, in the hands of the “Russian” school, all of whom were disciples of Gelfand. As for Combinatorics, forget it! And don’t even mention the computer, it is du diable. As for Gelfand, he was always sympathetic to all science, even biology! In fact he is also considered a prominent theoretical biologist. Gelfand also realizes the importance of combinatorics and computers.

Also people. Grothendieck was a loner, and hardly collaborated. On the other hand, Gelfand always (at least in the last sixty years) works with other people. Gelfand is also very interested in pedagogy, and in establishing math as an adequate language.

Grothendieck spent a lot of energy in rebellious political causes, probably since in his youth he was an obedient bon eleve. On the other hand, Gelfand was already kicked out of high-school (for political reasons), so could focus all his rebellious energy on innovative math.

So even if you are not quite as smart or original as Gelfand and Grothendieck (and who is?), you will still be able to do math well into your nineties, if you follow Gelfand’s, rather than Grothendieck’s, example.

Zeilberger also seems to have a lot of respect for G. J. Chaitin, something that I certainly find very interesting. I mention this because I have been reading and re-reading Chaitin these days, especially after discovering some of his very recent work on meta-biology.

PS: Zeilgerber was featured in a BBC Documentary on infinity (not a great one, though) in which he talked about his ultrafinitist viewpoint.

________________

Onionesque Reality Home >>

Endre Szemerédi wins the Abel Prize

In absolutely big news, the Norwegian Academy of Science and Letters has made a fantastic decision by awarding the 2012 Abel Prize to Prof. Endre Szemeredi, one of the greatest mathematicians of our time. We must remember that such decisions are made by committees, and hence I would congratulate the Abel Committee (comprising of Ragni Piene, Terence Tao, Dave Donoho, M. S. Raghunathan and Noga Alon) for such an excellent decision !

Some months ago I told one of my über-cool-dude supervisors (Gabor) that Endre would win the Abel prize this year (guessing was no rocket science!)! I usually don’t like making such statements as there are always many great mathematicians who could win at any given time and there are a lot of other factors too. But Gabor actually told this to Endre, who ofcourse didn’t think it was serious. But apparently he did win it this year! A very well deserved award!

It’s pointless to make an attempt to talk about (not that I am competent to do so anyway) some of Prof. Szemeredi’s deep results and the resulting fundamental contributions to mathematics. Timothy Gowers wrote a good article on the same for the non-mathematical audience. Especially see a mention of Machine Learning on page 7. However, other than the Regularity Lemma that I find absolutely beautiful, my other favorite result of Szemeredi is his Crossing Lemma. A brief discussion on the Regularity Lemma in an older blog post can be found here.

An Irregular Mind: Szemeredi is 70. Book from Szemeredi’s 70th birthday conference recently.

For a short background Prof. Szemeredi was born in Budapest and initially studied at Eötvös before getting his PhD from Moscow State University, where he was advised by the legendary Soviet mathematician Israel Gelfand. He presently holds a position both at the Alfréd Rényi Institute of Mathematics and Rutgers and has had held visiting positions at numerous other places. Recently on his 70th birthday The János Bolyai Mathematical Society organized a conference in his honour, the proceedings of which were published as an appropriately titled book – “An Irregular Mind” (obviously a play on his “Regularity Lemma” and related work and as stated in the book “Szemerédi has an ‘irregular mind’; his brain is wired differently than for most mathematicians. Many of us admire his unique way of thinking, his extraordinary vision.”).

Congratulations to Endre Szemeredi and the great, absolutely unique Hungarian way of doing mathematics.

_________________________

See Also: Short Course on Additive Combinatorics focused on the Regularity Lemma and Szemeredi’s Theorem, Princeton University. (h/t Ayan Acharya)

Onionesque Reality Home >>

Importing the Szemerédi Regularity Lemma into Machine Learning

Synopsis of a recent direction of work with Gábor Sárközy, Endre Szemerédi and Fei Song — “The Regularity Lemma is a deep result from extremal graph theory having great utility as a fundamental tool to prove theoretical results, but can it be employed in more “practical” settings?”

More specifically we are interested in the problem of harnessing the power of the regularity lemma to do clustering. This blog post is organized as follows: We first sketch the regularity lemma, we then see that it is an existential predicate and state an algorithmic version, we then look at how this constructive version may be used for clustering/segmentation. It must be noted that the idea seems to have originated from an earlier interesting work by Sperotto and Pellilio.

Before a brief introduction to the regularity lemma, I’d quote Luca Trevisan on (the related) Szemeredi’s Theorem from his blog

Szemeredi’s theorem on arithmetic progressions is one of the great triumphs of the “Hungarian approach” to mathematics: pose very difficult problems, and let deep results, connections between different areas of math, and applications, come out as byproducts of the search for a solution.

Even though I am nothing close to being a mathematician, but we’ll see that even an Electrical Engineer can somewhat understand and appreciate what Prof. Trevisan means in this specific context! ;-)

I will attempt to sketch a few things below with more emphasis on intuition than rigour! To be specific – intuition and then rigour.

_________________________

Introduction: One of the most fundamental and ingenious results in graph theory and discrete mathematics is the Szemeredi Regularity Lemma. It was originally advanced by Endre Szemeredi as an auxiliary result to prove a long standing conjecture of Erdős and Turán from 1936 (on the Ramsey properties of arithmetic progressions), which stated that sequences of integers with postive upper density must contain arbitrarily long arithmetic progressions (now called as Szemeredi’s theorem (1978)). Now the regularity lemma by itself is considered as one of the most important tools in graph theory.

Prof. Endre Szemeredi: Originator of the "Regularity Method"

A very rough statement of the regularity lemma could be made as follows:

Every graph can be approximated by random graphs. This is in the sense that every graph can be partitioned into a bounded number of equal parts such that:
1. Most edges run between different parts
2. And that these edges behave as if generated at random.

The fundamental nature of the Regularity Lemma as a tool in Graph Theory (and beyond) could be understood by the fact that two recent Fields Medals (Timothy Gowers, 1998 and Terence Tao, 2006) were awarded atleast in part for achieving breakthrough results on the regularity lemma and using those to prove fundamental results on arithmetic progressions amongst other results (Such as the Green-Tao Theorem).

_________________________

A Temporary Digression and Prof. Trevisan’s Statement: A first look at the statement of the regularity lemma above reminds one (especially those trained in signal processing I suppose) of the Fourier series. While this is extremely simplified – It’s not entirely incorrect to say that it points in part to a deeper relationship between combinatorics and analysis. Let’s look at it this way – Szemeredi employed the regularity lemma (a graph theoretic method) to prove what is now called Szemeredi’s theorem. The Szemeredi Theorem is a difficult one and it has four different proofs (graph-theoretic, ergodic, hypergraph-theoretic and lastly based on Fourier analysis)*.  Infact work on the Szemeredi theorem has contributed in better understanding connections between these areas (as an example – Timothy Gowers was awarded the Fields Medal for proving very deep results on the connections between analysis and combinatorics).

Terence Tao and Timothy Gowers have made breakthrough contributions to the regularity lemma and Szemeredi's Theorem amongst numerous others (Image Source: Republic of Math)

So it could be said that it all really started with Erdős and Turán posing a hard problem – The search for it’s solution has led to not only some good theorems but also some interesting connections between different areas of mathematics and thus a better understanding of all of them as a result. So we see what Prof. Trevisan means!

*While here the connection appears very sketchy though intuitive. The strong relationship between the Regularity Lemma and Analysis was shown in a breakthrough paper by László Lovász and Balázs Szegedy. In this paper they showed that the regularity lemma could be thought of as a result in analysis

Szemeredi Regularity Lemma for the Analyst (László Lovász and Balázs Szegedy)

_________________________

Given the background now we get to stating Szemeredi’s Regularity Lemma:

The Szemeredi Lemma: Expanding upon the rough statement mentioned a couple of paragraphs above, we could be a little more specific and say that

The Lemma states that every graph could be partitioned into a bounded number of quasi-random bi-partite graphs, called regular pairs and a few left over edges.

Now for a more precise statement, we introduce some definitions and notation.

Definition 1: If $G = (V,E)$ is a graph  and $A, B$ are disjoint subsets of $V$. Then, we denote the count of number edges with one endpoint in $A$ and another in $B$ by $e(A,B)$. The density of edges between $A$ and $B$ can then be defined as:

$\displaystyle d(A,B) = \frac{e(A,B)}{|A||B|}$

This just defines the edge density in a bipartite graph.

Definition 2: The bipartite graph $G = (A, B, E)$ is called $\varepsilon$-regular if for every $X \subset A$ and $Y \subset B$ satisfying

$|X| > \varepsilon |A|, |Y| > \varepsilon |B|$

we would have $|d(X,Y) - d(A,B)| < \varepsilon$

This means that a bipartite graph is epsilon regular if we were to take any random subsets (of some minimum size) $X, Y$ of $A,B$ respectively and even then find that the edge density between these subsets is almost the same as the edge density in the original bipartite graph. In effect this implies that if a bipartite graph is epsilon regular then all the edges between the the two disjoint sets are distributed uniformly.

Definition 3: An equitable partition of the vertex set $V$ of a graph $G = (V,E)$ is a partition of $V$ such that all the classes $C_0, C_1, \dots, C_k$ are pairwise disjoint. And that all classes $C_i$ with $1 \leq i \leq k$ have the same cardinality. It is noteworthy that oftentimes the vertex set might not have a cardinality that could be equally divided into the said number of classes, thus $C_0$ is present for a technical reason: To ensure that all the classes have the same cardinality.

Definition 4: For every equitable partition $P$ of the vertex set $V$ of $G = (V,E)$ into classes $C_0, C_1, \dots, C_k$ we associate a measure called the potential or the index of the partition $P$, which is defined as:

$\displaystyle ind(P) = \frac{1}{k^2} \sum_{1 \leq r \leq s \leq k} d(C_r, C_s)^2$

This measure just defines how close a partition is close to a regular one.

Definition 5: An equitable partition $C_0, C_1, \dots, C_k$ of the vertex set of graph $G$ given by $V$, where $C_0$ is the exceptional set, is called $\varepsilon$-regular if $C_0 < \varepsilon |V|$ and all but $\varepsilon k^2$ of the pairs $(C_i,C_j)$ are $\varepsilon$-regular such that $1 \leq i < j \leq k$.

We are now in a position to state the regularity lemma:

Theorem 1: [Regularity Lemma]
For every positive $\varepsilon$ and positive integer m, there are positive integers $N = N(\varepsilon, m)$ and $M = M(\varepsilon, m)$ such that the following property holds: For all graphs $G$ with $|V|\geq N$, there is an $\varepsilon$-regular partition of $G(V,E)$ into $k+1$ classes such that $m\leq k \leq M$.

The beauty of the regularity lemma is in the point that that the approximation for any graph does not depend on the number of vertices it has, but only on the error in approximation (represented by $\varepsilon$).

In the proof of the regularity lemma we start with an initial partition (with low potential) and then iteratively refine the partition in such a way that the potential increases to a point such that the partition is epsilon regular. However, this leads to a major problem (the first of our concerns) – In refining the partitions iteratively the number of classes increases exponentially in each iteration. We then end up with a partition in which the number of classes is a tower function – usually an astronomical figure. Clearly, this does not work very well for graphs arising in practice. This same point is made by Prof. Trevisan:

[…] one may imagine using the Regularity Lemma as follows: to prove an approximate quantitative statement about arbitrary graphs, we only need to prove a related statement for the finite set of objects that approximate arbitrary graphs up to a certain level of accuracy. The latter, completely finite, goal may be performed via a computer search. Unfortunately, the size of the objects arising in the Regularity Lemma grow so astronomically fast in the approximation that this approach is completely impractical.

As an aside: For a long while it was thought that maybe a tower function is not necessary. However in a remarkable paper Timothy Gowers constructed graphs that demonstrated that functions of the tower type were indeed necessary.

In any case, how can we get around this problem for approximating most graphs so that it could be useful in applications such as clustering? A possible solution to this problem in the context of clustering was proposed by Sperotto and Pelillo. They make some interesting points, however do not provide many details on their approach. We will get back to this problem in a short while.

But the first problem that we face is the following: The Szemeredi Lemma as originally proposed is an existential predicate. It does not give a method to obtain the regular partition for a given graph, but only says that one must exist! So if we are to use the Lemma in any practical setting, we need an algorithmic version. There now exist two algorithmic versions:

1. One proposed in a paper by Alon et al in 1992.

2. Another proposed by Frieze and Kannan and is based on the very intriguing relationship between singular values and regularity!

We for now, focus on the Alon et al. version.

_________________________

Algorithmic Version of the Regularity Lemma:

Theorem 2: [A Constructive Version of the Regularity Lemma – Alon et al.]

For every $\varepsilon > 0$ and every positive integer $t$ there is an integer $Q = Q(\varepsilon, t)$ such that every graph with $n > Q$ vertices has an $\varepsilon$-regular partition into $k + 1$ classes, where $t \le k \le Q$. For every fixed $\varepsilon > 0$ and $t \ge 1$ such a partition can be found in $O(M(n))$ sequential time, where $M(n)$ is the time for multiplying two $n$ by $n$ matrices with 0,1 entries over the integers. It can also be found in time $O(logn)$ on an ERPW PRAM with a polynomial number of parallel processors.

To really understand how the above theorem would require the introduction of several supporting definitions and lemmas (including one from Szemeredi’s original paper). Since our focus is on the application, we’d just state one lemma using which the idea behind the constructive version could be revealed in a more concrete sense.

Lemma 1: [Alon et al.] Let $H$ be a bipartite graph with equally sized classes $|A| = |B| = n$. Let $\displaystyle 2n^{\frac{-1}{4}} < \varepsilon <\frac{1}{16}$. There is an $O(M(n))$ algorithm that verifies that $H$ is $\varepsilon$-regular or finds two subset $A' \subset A$, $B' \subset B$, $\displaystyle |A'| \ge \frac{{\varepsilon}^4}{16}n$, $\displaystyle |B'| \ge \frac{{\varepsilon}^4}{16}n$, such that $\displaystyle |d(A, B) - d(A', B')| \ge \varepsilon^4$. The algorithm can be parallelized and implemented in $NC$.

This lemma basically says that whether or not a bipartite graph is epsilon regular is a question that can be answered very quickly. If it is – then we can have an algorithm to say so and if it is not, it should return the answer with a certificate or proof that it is not so. This certificate is nothing but subsets of the classes from the original bipartite graph. The idea of the certificate is to help to proceed to the next step.

The general idea in the Alon algorithm is:

Start with a random equitable partition of the graph $C_0, C_1, \dots, C_b$, where $\displaystyle |C_i| = \lfloor\ \frac{n}{b} ;\rfloor$ and $i = 1, 2, \dots, b$. Also $|C_0| < b$. Also let $k_1 = b$.

Then, for each pair $C_r, C_s$ in the partition $P_i$ check for regularity. If it is regular report so, if not find the certificate for the pair. If out of all the pairs, only $\varepsilon k_i^2$ are not epsilon regular – then halt, the partition $P_i$ is epsilon regular. If not then we refine this partition using the information gained from the certificates in the cases when the pairs not epsilon regular. On refining this partition we obtain a new partition $P'$ with $1 + k_i4^{k_i}$ classes. We repeat this process till we hit a partition which is epsilon regular.

_________________________

Using the Constructive Version for Clustering:

A little earlier it was pointed out that there are two main difficulties in using the Szemeredi Lemma in practical settings. At one level, to use the lemma in a practical setting we need a constructive version. However the constructive versions are so designed that they work for all graphs. To illustrate how this is a problem, we quote a line from the algorithmic version based on singular values (Freize and Kannan):

The Algorithm finishes in atmost $O(\varepsilon^{-45})$ steps with an $\varepsilon$-regular partition

Now, consider what this implies – If $\varepsilon$ is $\frac{1}{16}$ (a typical value). Then the number of steps in which we are guaranteed to find a regular partition is 1.53249554 × 1054.

This is astonishingly large number. To make the lemma truly practical, we would have to give up the aim of making it work for all graphs. Instead we should be content that it works on most graphs. Most graphs would largely be graphs that appear in practice. Like was mentioned earlier some directions in this regard were provided by Sperotto and Pellilio.

Another problem (as was mentioned earlier too) was that the number of classes grows exponentially with each refinement step. We can not allow the number of classes to grow astronomically either because if we do so then consequently we would never be able to refine the partitions far enough. Here we would have to compromise on the generality again. Instead of allowing the number of classes to grow exponentially we could allow it to grow by a defined value in each iteration. This value could be chosen depending on a number of parameters, including the data set size.

Even in making such an approximation – the potential would always increase with each iteration albeit much slowly as compared to the methodology as originally described. Infact we would say that it would work for most graphs. In our work this approach seems to work quite well.

Given these two changes, what still remains is how could the lemma be used for clustering datasets? One possible way is suggested by another result called the Key Lemma and the implication that it might have. This is stated below:

Lemma 2: Given an arbitrary graph $G=(V,E)$, a partition of $V$ in $k$ clusters as in regularity lemma described above and two parameters $\varepsilon, d$, we describe the reduced graph $G^R$ as the graph whose vertices are associated to the clusters and whose edges are associated to the $\varepsilon$-regular pairs with density above $d$. If we have a coloring on the edges of $G$, then the edges of the reduced graph will be colored with a color that appears on most of the edges between the two clusters.

Using the properties of what is called the Key-Lemma/Blow-Up Lemma, it is understood that this reduced graph should retain properties of the original graph. And thus any changes made on this reduced graph would reflect on the original graph.

Thus, one possible way to cluster is using a two-phase clustering strategy: First, we use the modified regularity lemma to construct a reduced graph using the method above. In the second phase, we use a pairwise clustering method such as spectral clustering to do cluster the the reduced graph. The results obtained on this reduced graph are then projected back onto the original graph using Lemma 2.

Two Phase Strategy to use the Regularity Lemma

Infact such a methodology gives results quite encouraging as compared to other standard clustering methods. These results would be reported here once the paper under consideration is published. :) Another thing to be noted is that the reduced graph is usually quite small as compared to the original graph and working on this smaller – reduced graph is much faster.

_________________________

Other Possible Changes:

1. The Alon et al. algorithm is used and modified to use the regularity lemma for clustering. It would be interesting to explore the results given by the Frieze & Kannan method as it has a different way to find the certificate.

2. There has been some work on the sparse regularity lemma. This is something that would have a lot of practical value as in the above we construct a dense graph. Using the sparse version would allow us to use nearest neighbor graphs instead of dense graphs. This would reduce the computational burden significantly.

3. A recent paper by Fischer et al.Approximate Hypergraph Partitioning and Applications” has received a lot of attention. In this paper they give a new approach for finding regular partitions. All the previous ones work to find partitions of the tower type, while this paper gives a method to find a smaller regular partition if one exists in the graph. Employing this methodology for refinement instead of using an approximate version of the algorithmic regularity lemma could be a fruitful direction of work.

_________________________

Onionesque Reality Home >>

Blind Geometers

This post is of general interest.

I was reading Prof. Alexei Sossinsky ‘s coffee table book on KnotsKnots: Mathematics with a Twist*, and it mentioned a couple of interesting cases of blind mathematicians. These couple of cases ignited enough interest to publish an old draft on blind mathematicians albeit now with a different flavor.

*(Note that the book has poor reviews on Amazon which I honestly don’t relate to. I think the errors reported in the reviews have been corrected plus the book is extremely short ~ 100 pages and hence actually readable on a few coffee breaks)

Sossinsky’s book gives an example of Antoine’s Necklace:

Antoine’s Necklace: A Wild Knot

Antoine’s Necklace is a Wild Knot that can be constructed as follows:

1. Start with a solid torus say $T_1$.

2. Place inside it four smaller tori linked two by two to make a chain. Let’s call this chain $T_2$.

3.  Inside each of the tori in step 2, construct a similar chain. This would be a set of 16 tori. Let’s call this $T_3$

4. Repeat this process ad-infinitum. The set obtained by the infinite set of Tori $T_i$ will be Antoine’s necklace.

$A = T_1 \cap T_2 \cap T_3 \cap \dotsb$

Antoine’s Necklace is not a mere curiosity and has very interesting properties. One would suppose that constructing such a structure would require considerable visualization, which is indeed true. However one of the most interesting things about this knot is that it was formulated and studied by Louis Antoine, who was blind. After he lost his eyesight, the famous mathematician Henri Lebesgue suggested to him that he study topology.

_______________

I have noticed (it is a common observation) that it is almost a rule that mathematicians who are blind are usually geometers/topologists. Such a correlation can not be mere coincidence.

Before reading Sossinsky’s book which also mentions G. Ya. Zuev as another influential blind topologist, the two best examples that I was aware of were L. S. Pontryagin and the great Leonhard Euler. Pontryagin is perhaps the first blind mathematician that I had heard of who made seminal contributions to numerous areas of mathematics (Algebraic Topology, Control Theory and Optimization to name a few). Some of his contributions are very abstract while some such as those in control theory are also covered in advanced undergrad textbooks (that is how I heard of him).

Lev Pontryagin (1908-1988)

Pontryagin lost his eyesight at the age of 14 and thus made all of his illustrious contributions (and learnt most of his mathematics) while blind. The case was a little different for Euler. He learnt most of his earlier mathematics while not blind. Born in 1707, he almost lost eyesight in the right eye in 1735. After that his eyesight worsened, losing it completely in 1766 to cataract.

Euler (1707-1783) on a Swiss Banknote

His mathematical productivity however actually increased, publishing more than half of his work after losing eyesight. Remarkably he published one paper each week in 1775 aided by students who doubled up as scribes. It is noteworthy that he is the most prolific mathematician to have ever lived in terms of number of pages published (Paul Erdős produced more papers), becoming one of the most influential mathematicians to have ever lived.

_______________

This excellent (as usual) Notices of the AMS article lists a few more famous blind mathematicians. Bernard Morin and Nicholas Suanderson to name a couple. Bernard Morin is famous for his work on sphere eversion (i.r homotopy, many youtube videos on this theme are available, video below).

Morin’s Surface

It is difficult to imagine for ordinary people that such work could be done by somebody who has been blind since age six. What could be the explanation for what I atleast consider an extraordinary and counter intuitive case?

Sossinsky in his book talks briefly of what he thinks about it and of some research in the area (though he doesn’t point out specific papers, it turns out there is a lot of interesting work on this aspect on spatial representation in blind people). He writes:

“It is not surprising at all that almost all blind mathematicians are geometers. The spatial intuition that sighted people have is based on the image of the world that is projected on their retinas; thus it is a two (and not three) dimensional image that is analysed in the brain of a sighted person. A blind person’s spatial intuition on the other hand, is primarily the result of tile and operational experience. It is also deeper – in the literal as well as the metaphorical sense. […]

recent biomathematical studies have shown that the deepest mathematical structures, such as topological structures, are innate, whereas finer structures, such as linear structures are acquired. Thus, at first, the blind person who regains his sight does not distinguish a square from a circle: He only sees their topological equivalence. In contrast, he immediately sees that a torus is not a sphere […]”

The Notices article has a line: “In such a study the eyes of the spirit and the habit of concentration will replace the lost vision”, referring to what is called as the Mind’s Eye commonly (i.e it is commonly believed that people with disabilities have some other senses magnified). Some of the work of the celebrated neuroscientist Oliver Sacks  (who I also consider as one of my role models. Movie buffs would recognize him from Dr Malcolm Sayer’s character in the fantastic movie Awakenings) talks of individuals in which this was indeed the case. He documents some of such cases in his book, The Mind’s Eye. He also notes that such magnification ofcourse does not happen in all of his patients but only in some fascinating cases.

The Mind’s Eye by Oliver Sacks (Click on image to view on Amazon)

Here in the video below (many more available on youtube) Dr Sacks describes some of such cases:

I wonder when we’d know enough. For such cases tell us something interesting about the brain, it’s adaptability, vision and spatial representation.

The Notices article also cites some examples of famous blind mathematicians who were not geometers, perhaps the more interesting cases if I could loosely put it that way.

_______________

Translation of the Article in Romanian:

Geometri Blind by Alexander Ovsov

Recommendations

1. The World of Blind Mathematicians – Notices of the AMS, Nov 2002  (pdf)

2. The Mind’s Eye – Oliver Sacks (Amazon)

3. Knots Mathematics with a Twist – Alexiei Sossinsky (Amazon)

5. Mathematical Reasoning and External Symbolic Systems – Catarina Dulith Novaes

_______________

Onionesque Reality Home >>

I had an occasion to read this fantastic book over the last couple of months. This book is a compilation by Hao Wang, a confidante of Kurt Gödel. Compiled over ten years it contains Gödel’s views on a wide array of areas. Some of these insights are little known, some are very interesting and some just show that Gödel was just as human in making errors.

This is an unadulterated chronicle of a brilliant mind. I don’t know what else to say to suggest this book. It is a must read for anyone with a remote interest in Mathematical Logic, Philosophy and Kurt Godel.

A Logical Journey - From Godel to Philosophy

____________

Here is an introduction to the book and the author.

Hao Wang (1921-1995) was one of the few confidantes of the great mathematician and logician Kurt Gödel. A Logical Journey is a continuation of Wang’s Reflections on Kurt Gödel and also elaborates on discussions contained in From Mathematics to Philosophy. A decade in preparation, it contains some important and unfamiliar insights into Gödel’s views on a wide range of issues, from Platonism and the nature of logic, to minds and machines, the existence of god, and positivism and phenomenology.

The impact of Gödel’s theorem on twentieth century thought is on a par with that of Einstein’s theory of relativity, Heisenberg’s uncertainty principle or Keynesian economics. These previously unpublished intimate and informal conversations, however, bring to light and amplify Gödel’s other major contributions to logic and philosophy. They reveal that there is much more in Gödel’s philosophy of mathematics than is commonly realized, and more in his philosophy than just a philosophy of mathematics.

Wang writes that “it is even possible that his quite informal and loosely structured conversations with me, which I am freely using in this book, will turn out to be the fullest existing expression of the diverse components of his inadequately articulated general philosophy”

I will leave you with some quotes from the book. Unless mentioned, they are by Kurt Gödel.

1. To develop the skill of correct thinking is in the first place to learn what you have to disregard. In order to go on, you have to know what to leave out: this is the essence  of effective thinking. (1972, from chapter 1 – Gödel’s life)

2. I would not say that one cannot polemicize against Nietzsche. But it should of course also be a writer [Dichter] or a person of the same type to do that. (17.2.48)

3. What you say about sadness is right : if there were a completely hopeless sadness, there would be nothing beautiful in it. But I believe there can rationally be no such thing. Since we understand neither why this world exists, nor why it is constituted exactly as it is, nor why we are in it, nor why we were born into exactly these and no other external relations: why then should we presume to know exactly this to be all  at there is no other world and that we shall never be in yet another one? (27.2.50)

4. One. cannot really say that complete ignorance is sufficient ground for hopelessness . If e.g. someone will land on an island completely unknown to him, it is just as likely that it is inhabited by harmless people as that it is by cannibals, and his ignorance gives no reason for hopelessness , but rather for hope. Your aversion against occult phenomena is of course well justified to the extent that we are here facing a hard-to-disentangle mixture of deception, credulousness and stupidity, with genuine phenomena. But the result (and the meaning) of the deception is, in my opinion, not to fake genuine phenomena but to conceal them. (3.4.50)

5. Is the book about Einstein really so hard to understand? I think that prejudice against and fear of every “abstraction” may also be involved here, and if you would attempt to read it like a novel (without wanting to understand right away everything at the 6rst reading), perhaps it would not seem so incomprehensible to you. (8.1.51)

6. As you know, I am indeed also thoroughly anti nationalistic, but one cannot, I believe, decide hastily against the possibility that people like Bismarck have the honorable intention to do something good. (7.11.56)

7. About the relation of art and kitsch we have, I believe, already discussed many times before. It is similar to that between light and heavy music. One could, however, hardly assert that all good music must be tragic? (23.3.57)

8. It is always enjoyable to see that there are still people who value a certain measure of idealism. (12.11.61)

9. Of all that we experience, there eventually of course remains only a memory, but just in this way all lasting things retain some of their actuality. (24.3.63)

10. She was not a beauty, but she was an extraordinarily intelligent person and had an extremely important role [in his life], because she was actually what one calls the life-line. She connected him to the earth. Without her, he could not exist at all.

A complicated marriage, but neither could exist without the other. And the idea that she should die before him was unthinkable for him. It is fortunate that he died before her. He was absolutely despondent when she was sick. He said, “Please come to visit my wife.”

She once told me, ‘I have to hold him like a baby.”  (1.4.2-3-4,  Alice Von Kahler on Adele and Kurt Godel)

Einstein and Godel at IAS, Princeton

11.The one man who was, during the last years, certainly by far Einstein‘s best friend, and in some ways strangely resembled him most was Kurt Godel,  the great logician. They were very different in almost every personal way- Einstein gregarious, happy, full of laughter and common sense, and Godel extremely solemn,very serious, quite solitary, and distrustful of common sense as a means of arriving at the truth. But they shared a fundamental quality: both went directly and wholeheartedly to the questions at the very center of things (in Holton and Elkena, Straus 1982:422).

12. Einstein has often told me that in the late years of his life he has continually sought Godel’s company in order to have discussions with him. Once he said to me that his own work no longer meant much, that he came to the Institute merely to have the privilege to walk home with Godel. [“The late years” probably began in 1951 , when Einstein stopped working on the unified theory. 1.6.2 Oskar Morgenstern]

13. The philosophers have only interpreted the world, in various ways, the point, however , is to change it. (Karl Marx, Theses on Feverbach 1845, Chapter 3)

14. The place which philosophy has occupied in Chinese civilization has been comparable to that of religion in other civilizations. In the world of the future, man will have philosophy in the place of religion. This is consistent with the Chinese tradition. It is not necessary that man should be religious, but it is necessary that he should be philosophical. When he is philosophical he has the very best of the blessings of religion. (Fung1 948:1, 6).

15. Engaging in philosophy is salutary in any case, even when no positive results emerge from it (and I remain perplexed). It has the effect  [Wirkung] that “the color [is] brighter,” that is, that reality appears more clearly as such. This observation reveals that , according to Godel’s conception , the study of philosophy helps us to see reality more distinctly , even though it may happen that no (communicable ) positive results come out of it to help others.

16. In presenting these conversations, you should pay attention to three principles: (1) deal only with certain points; (2) separate out the important and the new; and (3) pay attention to connections. Godel, 5 February 1976

17. The notion of existence is one of the primitive concepts with which we must begin as given. It is the clearest concept we have. Even “all”, as studied in predicate logic, is less clear, since we don’t have an overview of the whole world. We are here thinking of the weakest and the broadest sense of existence. For example, things which act are different from things which don’t. They all have existence proper to them. (4.4.12)

18. Existence: we know all about it, there is nothing concealed. The concept of existence helps us to form a good picture of reality. It is important for supporting a strong philosophical view and for being open-minded in reaching it. (4.4.13)

19. Power is a quality that enables one to reach one’s goals. Generalities contain the laws which enable you to reach your goals. Yet a preoccupation with power distracts us from paying attention to what is at the foundation of the world, and it fights against the basis of rationality. (4.4.14)

20. The world tends to deteriorate: the principle of entropy. Good things appear from time to time in single persons and events. But the general development tends to be negative. New extraordinary characters emerge to prevent the downward movement. Christianity was best at the beginning. Saints slow down the downward movement. In science, you may say, it is different. But progress occurs not in the sense of understanding the world, only in the sense of dominating the world, for which the means remains, once it is there. Also general knowledge though not in the deeper sense of first principles, has moved upwards. Specifically, philosophy tends to go down. (4.4.15)

21. The view that existence is useful but not true is widely held; not only in mathematics but also in physics, where it takes the form of regarding only the directly observable [by sense perception] as what exists. This is a prejudice of the time. The psychology behind it is not the implicit association of existence with time, action, and so on. Rather the association is with the phenomenon that consistent but wrong assumptions are useful sometimes. Falsity is in itself something evil but often serves as a tool for finding truth. Unlike objectivism, however, the false assumptions are useful only temporarily and intermediately. (4.4.16)

22. Einstein’s religion is more abstract, like that of Spinoza and Indian philosophy. My own religion is more similar to the religion of the churches. Spinoza’s God is less than a person. Mine is more than a person, because God can’t be less than a person. He can play the role of a person. There are spirits which have no body but can communicate with and influence the world. They keep [themselves] in the background today and are not known. It was different in antiquity and in the Middle Ages, when there were miracles. Think about deja vu and thought transference. The nuclear processes, unlike the chemical , are irrelevant to the brain.

23. The possible worldviews [can be divided] into two groups [conceptions]: skepticism, materialism and positivism stand on one [the left] side; spiritualism, idealism and theology on the other [the right]. The truth lies in the middle,or consists in a combination of these two conceptions. (Chapter 5)

24. Some reductionism is right: reduce to concepts and truths, but not to sense perceptions. Really it should be the other way around: Platonic ideas [what Husserl calls “essences ” and Godel calls “concepts”] are what things are to be reduced to. Phenomenology makes them [the ideas] clear. (5.3.15)

25. Introspection is an important component of thinking; today it has a bad reputation. Introspective psychology is completely overlooked today. Epoche concerns how introspection should be used, for example, to detach oneself from influences of external stimuli (such as the fashions of the day). Even the scientists (fashions of the day). Even the scientists [sometimes] do not agree because they are not [detached true] subjects [ in this sense].

26. Positivists decline to acknowledge any apriori knowledge. They wish to reduce everything to sense perceptions. Generally they contradict themselves in that they deny introspection as experience, referring to higher mental phenomena as “judgments”. They use too narrow a notion of experience and introduce an arbitrary bound on what experience is, excluding phenomenological experience. Russell (in his 1940 (Inquiry into Meaning and Truth]) made a more drastic mistake in speaking as if sense experience were the only experience we can find by introspection.

27. For approaching the central part of philosophy, there is good reason to confine one’s attention to reflections on mathematics. Physics is perhaps less well suited for this purpose; Newtonian physics would be better. (Chapter 9)

28. The meaning of the world is the separation of wish and fact. (Chapter 9)

29. Whole and part– partly concrete parts and partly abstract parts- are at the bottom of everything. They are most fmtdamental in our conceptual system. Since there is similarity, there are generalities. Generalities are just a fundamental aspect of the world. It is a fundamental fact of reality that there are two kinds of reality: universals and particulars (or individuals).

30. Zhi zhi wei zhi zhi, bu zhi wei bu zhi, shi zhi ye. (To know that you know when you do know and know that you do not know when you do not know: that is knowledge .) Confucius, Analects, 2: 17. (Epilogue).

____________

Onionesque Reality Home >>

N is a Number : A Portrait of Paul Erdős

N is a Number : A Portrait of Paul Erdős is one of the most delightful, endearing and probably one of the best documentaries I have seen on an individual. I have always regarded Paul  Erdős as one of my personal heroes and hence It seems weird that I had not seen this rather old documentary earlier. Especially given it’s extremely high quality, appeal and not to mention the character it is based on.  However, it is never late to discover something so good.

There is an extremely good wikipedia entry on Paul Erdős. However I would still write a few words on him before linking to the videos.

A Mathematician is a machine for turning coffee into theorems.

— An extremely famous quote attributed to Alfréd Rényi. It was originally intended for Hungarian mathematicians and the mathematical-circles culture that flourished there giving the world so many mathematical giants.

__________

Erdős was an extremely prolific and famously eccentric mathematician, producing more papers and collaborating with more people than anybody in history. His eccentricity made him an extremely lovable character, and he made a fair share in contributing to human comedy.

He had no home and no full time job, he traveled around the world for half a century. Surviving on living with collaborators, fees from lectures and other appearances. His dis-interest in anything carnal or materialistic was almost Zen like I would dare say. Just having two pairs of half empty suitcases as his only belongings as he moved along from one location onto another.

It is often said (and quite correctly) that if you finish all bees in the world, the world would not survive for long. We could use that as an allegory for the sciences /mathematics as well. Erdős was essentially a bee. Brilliant in many areas of mathematics, he traveled from place to place using one idea from one area into another, cross-pollinating them, generating interest with his lovable anecdotes and enriching Mathematics as a consequence. A welcome departure from the so called purists.

His mathematical output was so prolific that a tribute is the famous Erdős number that gives the collaborative distance of a mathematician with him. The reason for such astounding output was not just his love for only mathematics but a brilliant memory. Colleagues have remarked that he could remember problems discussed years ago and exactly what the details that were talked about. If a mathematical problem was left half way, he could still remember where was the point they stopped, even if revisited after years. Not just that, he had this knack of knowing the mind of other mathematicians in where their interests lay. So he knew who would like to work on what kind of problems.

Though Mathematics was his only love, his knowledge was extremely wide and he could talk with most people about most things they might be interested in. Almost educated in the classical European style, with interests spreading across other basic sciences, politics, history. literature etc.

His work was not rich just in quantity. He displayed an extremely good taste in choosing and posing problems. The solutions to some of which have resulted in entirely new areas of Mathematics. Paul Erdős had been a towering figure even while he was alive, but as more time passes by, he only grows taller.

__________

N is a Number : A Portrait of Paul Erdős – Videos

Total Runtime : 57 Minutes

[View Here]

– Based on the book “The man who only loves numbers” by Paul Hoffman.

– Made by George Paul Csicsery 1993

– Narrated by James Locker

– Music by Mark Adler (I have to mention the music as I thought it was pretty beautiful, especially towards the end)

__________

Hat Tip : To Dr Vitorino Ramos’ ever thoughtful blog

__________

Onionesque Reality Home >>