Over the past 4-5 months, whenever there has been some time to spare, I have been working through The Cauchy-Schwarz Master Class by J. Michael Steele. Although I still have the last two chapters left, I have been reflecting on the material already covered in order to get a better perspective on what I have been slowly learning over the months. This blog post is a small exercise in this direction.
Of course, there is nothing mysterious about proving the Cauchy-Schwarz inequality; it is fairly obvious and basic. But I thought it might still be instructive (mostly to myself) to reproduce from memory some proofs that I know (following a maxim of my supervisor on a white/blackboard blogpost). What is indeed quite interesting and non-obvious, though, is why Cauchy-Schwarz keeps appearing all the time and what makes it so useful and fundamental. And as Gil Kalai notes, it is also unclear why it is Cauchy-Schwarz in particular that is mainly useful. I feel that Steele’s book has made me appreciate this importance somewhat more (compared to 4-5 months ago) by drawing attention to many concepts that link back to Cauchy-Schwarz.
Before getting to the post, a word on the book: it is perhaps amongst the best mathematics books that I have seen in many years. True to its name, it is indeed a Master Class, and also truly addictive. I could not put the book aside once I had picked it up, and eventually decided to go all the way. Like most great books, the way it is organized makes it “very natural” to rediscover by yourself many substantial results (some of them named) that appear much later, provided you happen to ask the right questions. The emphasis on problem solving makes sure you make very good friends with some of the most interesting inequalities, and the number of inequalities featured is extensive. It starts off with inequalities dealing with “natural” notions such as monotonicity and positivity, and later moves on to somewhat less natural notions such as convexity. I can’t recommend this book enough!
Now getting to the proofs: Some of these proofs appear in Steele’s book, mostly as either challenge problems or as exercises. All of them were solvable after some hints.
Proof 1: A Self-Generalizing proof
This proof is essentially due to Titu Andreescu and Bogdan Enescu and has now grown to be my favourite Cauchy-Schwarz proof.
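We start with the trivial inequality (for $a, b \in \mathbb{R}$ and $x, y > 0$):
$$(ay - bx)^2 \geq 0$$
Expanding, we have
$$a^2y^2 - 2abxy + b^2x^2 \geq 0$$
Dividing by $xy > 0$ and rearranging, we get:
$$\frac{a^2 y}{x} + \frac{b^2 x}{y} \geq 2ab$$
Adding $a^2 + b^2$ to both sides gives $\frac{a^2(x+y)}{x} + \frac{b^2(x+y)}{y} \geq (a+b)^2$, and dividing by $x + y$ we get the following trivial Lemma:
Lemma 1: $$\frac{(a+b)^2}{x+y} \leq \frac{a^2}{x} + \frac{b^2}{y}$$
Notice that this Lemma is self-generalizing in the following sense. Suppose we replace $b$ with $b + c$ and $y$ with $y + z$ (where $z > 0$), then we have:
$$\frac{(a+b+c)^2}{x+y+z} \leq \frac{a^2}{x} + \frac{(b+c)^2}{y+z}$$
But we can apply Lemma 1 to the second term of the right hand side one more time. So we would get the following inequality:
$$\frac{(a+b+c)^2}{x+y+z} \leq \frac{a^2}{x} + \frac{b^2}{y} + \frac{c^2}{z}$$
Repeating the same step until we have $n$ terms, we get the following:
$$\frac{(a_1 + a_2 + \dots + a_n)^2}{x_1 + x_2 + \dots + x_n} \leq \frac{a_1^2}{x_1} + \frac{a_2^2}{x_2} + \dots + \frac{a_n^2}{x_n}$$
Now substitute $a_i = \alpha_i \beta_i$ and $x_i = \beta_i^2$ (assuming each $\beta_i \neq 0$; indices with $\beta_i = 0$ can be dropped first, which only decreases the right-hand side below) to get:
$$\left(\sum_{i=1}^n \alpha_i \beta_i\right)^2 \leq \left(\sum_{i=1}^n \alpha_i^2\right)\left(\sum_{i=1}^n \beta_i^2\right)$$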
This is just the Cauchy-Schwarz Inequality, thus completing the proof.
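As a quick numerical sanity check of the final substitution, here is a minimal sketch. It assumes the $n$-term bound takes the form $(a_1 + \dots + a_n)^2 / (x_1 + \dots + x_n) \leq \sum_i a_i^2 / x_i$ for positive $x_i$, and that the substitution is $a_i = \alpha_i \beta_i$, $x_i = \beta_i^2$. The function names `engel_gap` and `cauchy_schwarz_gap` are made up for this illustration.

```python
import random


def engel_gap(a, x):
    """Return sum(a_i^2 / x_i) - (sum a_i)^2 / (sum x_i); needs every x_i > 0."""
    if any(xi <= 0 for xi in x):
        raise ValueError("all x_i must be positive")
    return sum(ai * ai / xi for ai, xi in zip(a, x)) - sum(a) ** 2 / sum(x)


def cauchy_schwarz_gap(alpha, beta):
    """Return sum(alpha_i^2) * sum(beta_i^2) - (sum alpha_i * beta_i)^2."""
    inner = sum(p * q for p, q in zip(alpha, beta))
    return sum(p * p for p in alpha) * sum(q * q for q in beta) - inner * inner


random.seed(0)
alpha = [random.uniform(-1.0, 1.0) for _ in range(6)]
beta = [random.uniform(0.1, 1.0) for _ in range(6)]  # nonzero, so beta_i^2 > 0

# The substitution a_i = alpha_i * beta_i and x_i = beta_i^2.
a = [p * q for p, q in zip(alpha, beta)]
x = [q * q for q in beta]

# Both gaps should be non-negative (up to rounding), and they differ
# exactly by the factor sum(beta_i^2).
print(engel_gap(a, x) >= -1e-12, cauchy_schwarz_gap(alpha, beta) >= -1e-12)
```

Multiplying `engel_gap(a, x)` by `sum(x)` reproduces `cauchy_schwarz_gap(alpha, beta)`, which is the substitution step in numerical form.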
Proof 2: By Induction
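Again, the Cauchy-Schwarz Inequality is the following: for $a_i, b_i \in \mathbb{R}$,
$$a_1b_1 + a_2b_2 + \dots + a_nb_n \leq \sqrt{a_1^2 + a_2^2 + \dots + a_n^2}\,\sqrt{b_1^2 + b_2^2 + \dots + b_n^2}$$
For a proof of the inequality by induction, the most important thing is starting with the right base case. Clearly the case $n = 1$ is trivially true, suggesting that it is perhaps not of much import. So we consider the case $n = 2$, which is:
$$a_1b_1 + a_2b_2 \leq \sqrt{a_1^2 + a_2^2}\,\sqrt{b_1^2 + b_2^2}$$
To prove the base case, we square both sides (if the left-hand side is negative there is nothing to prove) and simply expand the expressions, to get:
$$a_1^2b_1^2 + 2a_1b_1a_2b_2 + a_2^2b_2^2 \leq a_1^2b_1^2 + a_1^2b_2^2 + a_2^2b_1^2 + a_2^2b_2^2$$
Which is just:
$$(a_1b_2 - a_2b_1)^2 \geq 0$$
Which proves the base case.
Moving ahead, we assume the following inequality, call it $H(n)$, to be true:
$$a_1b_1 + \dots + a_nb_n \leq \sqrt{a_1^2 + \dots + a_n^2}\,\sqrt{b_1^2 + \dots + b_n^2}$$
To establish Cauchy-Schwarz, we have to demonstrate, assuming the above, that $H(n+1)$ holds:
$$a_1b_1 + \dots + a_{n+1}b_{n+1} \leq \sqrt{a_1^2 + \dots + a_{n+1}^2}\,\sqrt{b_1^2 + \dots + b_{n+1}^2}$$
So, we start from $H(n)$. Adding $a_{n+1}b_{n+1}$ to both sides, we further have,
$$a_1b_1 + \dots + a_nb_n + a_{n+1}b_{n+1} \leq \sqrt{a_1^2 + \dots + a_n^2}\,\sqrt{b_1^2 + \dots + b_n^2} + a_{n+1}b_{n+1} \qquad (\ast)$$
Now, we can apply the case $n = 2$. Recall that:
$$\alpha_1\beta_1 + \alpha_2\beta_2 \leq \sqrt{\alpha_1^2 + \alpha_2^2}\,\sqrt{\beta_1^2 + \beta_2^2}$$
Thus, using this in the R.H.S. of $(\ast)$ with $\alpha_1 = \sqrt{a_1^2 + \dots + a_n^2}$, $\beta_1 = \sqrt{b_1^2 + \dots + b_n^2}$, $\alpha_2 = a_{n+1}$ and $\beta_2 = b_{n+1}$, we would now have:
$$a_1b_1 + \dots + a_{n+1}b_{n+1} \leq \sqrt{a_1^2 + \dots + a_n^2 + a_{n+1}^2}\,\sqrt{b_1^2 + \dots + b_n^2 + b_{n+1}^2}$$
This proves the case $H(n+1)$ on assuming $H(n)$, thus also proving the Cauchy-Schwarz inequality by the principle of mathematical induction.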
Proof 3: For Infinite Sequences using the Normalization Trick:
We start with the following question.
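Problem: For real sequences $\{a_i\}$ and $\{b_i\}$: if $\sum_{i=1}^{\infty} a_i^2 < \infty$ and $\sum_{i=1}^{\infty} b_i^2 < \infty$, then is $\sum_{i=1}^{\infty} |a_i b_i| < \infty$?
Note that this is easy to establish. We simply start with the trivial inequality $(x - y)^2 \geq 0$, which in turn gives us
$$xy \leq \frac{1}{2}x^2 + \frac{1}{2}y^2$$
Next, take $x = |a_i|$ and $y = |b_i|$; on summing up to infinity on both sides, we get the following:
$$\sum_{i=1}^{\infty} |a_i||b_i| \leq \frac{1}{2}\sum_{i=1}^{\infty} a_i^2 + \frac{1}{2}\sum_{i=1}^{\infty} b_i^2 \qquad (\ast)$$
From this it immediately follows that $\sum_{i=1}^{\infty} |a_i b_i| < \infty$.
Now normalize (assuming neither sequence is identically zero): let $\hat{a}_i = a_i / \left(\sum_j a_j^2\right)^{1/2}$ and $\hat{b}_i = b_i / \left(\sum_j b_j^2\right)^{1/2}$, so that $\sum_i \hat{a}_i^2 = \sum_i \hat{b}_i^2 = 1$; substituting in $(\ast)$, we get:
$$\sum_{i=1}^{\infty} |\hat{a}_i||\hat{b}_i| \leq \frac{1}{2} + \frac{1}{2} = 1$$
Which simply gives back Cauchy’s inequality for infinite sequences, thus completing the proof:
$$\sum_{i=1}^{\infty} |a_i b_i| \leq \left(\sum_{i=1}^{\infty} a_i^2\right)^{1/2}\left(\sum_{i=1}^{\infty} b_i^2\right)^{1/2}$$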
Proof 4: Using Lagrange’s Identity
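We first start with a polynomial which we denote by $Q_n$:
$$Q_n = (a_1^2 + a_2^2 + \dots + a_n^2)(b_1^2 + b_2^2 + \dots + b_n^2) - (a_1b_1 + a_2b_2 + \dots + a_nb_n)^2$$
The question to now ask is: is $Q_n \geq 0$? To answer this question, we start off by re-writing $Q_n$ in a “better” form.
$$Q_n = \sum_{i=1}^n\sum_{j=1}^n a_i^2 b_j^2 - \sum_{i=1}^n\sum_{j=1}^n a_ib_ia_jb_j$$
Next, as J. Michael Steele puts it, we pursue symmetry and rewrite the above so as to make it apparent.
$$Q_n = \frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n \left(a_i^2b_j^2 + a_j^2b_i^2\right) - \frac{2}{2}\sum_{i=1}^n\sum_{j=1}^n a_ib_ia_jb_j$$
Thus, we now have:
$$Q_n = \frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n (a_ib_j - a_jb_i)^2$$
This makes it clear that $Q_n$ can be written as a sum of squares and hence is always non-negative. Let us write out the above completely:
$$\sum_{i=1}^n\sum_{j=1}^n a_i^2 b_j^2 - \sum_{i=1}^n\sum_{j=1}^n a_ib_ia_jb_j = \frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n (a_ib_j - a_jb_i)^2$$
Now, reversing the step we took at the onset to write the L.H.S. better, we simply have:
$$\left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right) - \left(\sum_{i=1}^n a_ib_i\right)^2 = \frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n (a_ib_j - a_jb_i)^2$$
This is called Lagrange’s Identity. Now, since the R.H.S. is always greater than or equal to zero, we get the following inequality as a corollary:
$$\left(\sum_{i=1}^n a_ib_i\right)^2 \leq \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right)$$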
This is just the Cauchy-Schwarz inequality, completing the proof.
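Lagrange's Identity is easy to check numerically. The sketch below assumes the standard form of the identity, $\sum_i a_i^2 \sum_i b_i^2 - \left(\sum_i a_ib_i\right)^2 = \frac{1}{2}\sum_i\sum_j (a_ib_j - a_jb_i)^2$, and evaluates both sides separately. The helper name `lagrange_sides` is invented for this example.

```python
def lagrange_sides(a, b):
    """Return (lhs, rhs) of Lagrange's identity for equal-length real sequences."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    n = len(a)
    inner = sum(ai * bi for ai, bi in zip(a, b))
    # Left side: the Cauchy-Schwarz "defect".
    lhs = sum(ai * ai for ai in a) * sum(bi * bi for bi in b) - inner ** 2
    # Right side: half of the full double sum of squared 2x2 determinants.
    rhs = 0.5 * sum(
        (a[i] * b[j] - a[j] * b[i]) ** 2 for i in range(n) for j in range(n)
    )
    return lhs, rhs


lhs, rhs = lagrange_sides([1, 2, 3], [4, 5, 6])
print(lhs, rhs)  # both sides equal 54 for this input
```

Since the right side is a sum of squares, it vanishes exactly when every $a_ib_j - a_jb_i$ is zero, that is, when the two sequences are proportional.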
Proof 5: Gram-Schmidt Process gives an automatic proof of Cauchy-Schwarz
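First we quickly review the Gram-Schmidt Process: Suppose we are given a set of linearly independent elements $\{x_1, x_2, \dots, x_n\}$ of a real or complex inner product space $(V, \langle \cdot, \cdot \rangle)$. We can get an orthonormal set of $n$ elements $\{e_1, e_2, \dots, e_n\}$ by the simple recursion (after setting $e_1 = x_1 / \|x_1\|$):
$$z_k = x_k - \sum_{j=1}^{k-1} \langle x_k, e_j \rangle e_j, \qquad e_k = \frac{z_k}{\|z_k\|}, \qquad k = 2, 3, \dots, n$$
Keeping the above in mind, assume that $\|x\| = 1$ and that $y$ is linearly independent of $x$ (if $y$ is a multiple of $x$, Cauchy-Schwarz holds with equality). Now let $e_1 = x$. Thus, we have:
$$z = y - \langle y, e_1 \rangle e_1$$
Giving: $e_2 = z / \|z\|$. Rearranging we have:
$$y = \langle y, e_1 \rangle e_1 + \|z\| e_2 = c_1 e_1 + c_2 e_2$$
where $c_1 = \langle y, e_1 \rangle$ and $c_2 = \|z\|$ are constants.
Now note that: $\langle y, x \rangle = c_1$ and, by orthonormality of $e_1$ and $e_2$, $\langle y, y \rangle = |c_1|^2 + |c_2|^2$. The following bound is trivial:
$$|c_1| \leq \left(|c_1|^2 + |c_2|^2\right)^{1/2}$$
But note that this is simply
$$|\langle x, y \rangle| \leq \langle y, y \rangle^{1/2}$$
Which is just the Cauchy-Schwarz inequality when $\|x\| = 1$.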
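The two-vector case of the recursion can be sketched in a few lines. This is only an illustration for real vectors stored as Python lists, assuming $x$ has unit length and $y$ is not a multiple of $x$. The names `dot` and `two_step_gram_schmidt` are invented here, with `c1` and `c2` standing for the two coefficients of $y$ in the orthonormal pair.

```python
import math


def dot(u, v):
    """Standard real inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(u, v))


def two_step_gram_schmidt(x, y):
    """Given a unit vector x and a vector y independent of x, return (c1, c2, e2)
    such that y = c1 * x + c2 * e2 with e2 a unit vector orthogonal to x."""
    if abs(dot(x, x) - 1.0) > 1e-9:
        raise ValueError("x must have unit length")
    c1 = dot(y, x)
    z = [yi - c1 * xi for xi, yi in zip(x, y)]  # remove the component along x
    c2 = math.sqrt(dot(z, z))
    if c2 == 0.0:
        raise ValueError("y must be linearly independent of x")
    e2 = [zi / c2 for zi in z]
    return c1, c2, e2


x = [0.6, 0.8]
y = [2.0, 1.0]
c1, c2, e2 = two_step_gram_schmidt(x, y)
# <y, y> splits as c1^2 + c2^2, so |<x, y>| = |c1| cannot exceed ||y||.
print(abs(c1) <= math.sqrt(dot(y, y)))
```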
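Proof 6: Proof of the Continuous version for d = 2; Schwarz’s Proof
For this case, the inequality may be stated as follows.
Suppose we have $S \subset \mathbb{R}^2$ and that $f : S \to \mathbb{R}$ and $g : S \to \mathbb{R}$. Then consider the double integrals:
$$A = \iint_S f^2 \, dx\,dy, \qquad B = \iint_S fg \, dx\,dy \qquad \text{and} \qquad C = \iint_S g^2 \, dx\,dy$$
These double integrals must satisfy the following inequality:
$$|B| \leq \sqrt{A}\,\sqrt{C}$$
The proof given by Schwarz, as reported in Steele’s book (and indeed in standard textbooks), is based on the following observation:
The real polynomial below is always non-negative:
$$p(t) = \iint_S \left(t f(x, y) + g(x, y)\right)^2 dx\,dy = At^2 + 2Bt + C$$
and in fact $p(t) > 0$ unless $f$ and $g$ are proportional. Thus from the binomial formula (the discriminant $4B^2 - 4AC$ of a non-negative quadratic cannot be positive) we have that $B^2 \leq AC$; moreover, the inequality is strict unless $f$ and $g$ are proportional.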
Proof 7: Proof using the Projection formula
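Problem: Consider any point $x \neq 0$ in $\mathbb{R}^d$. Now consider the line that passes through this point and the origin. Let us call this line $\mathcal{L} = \{tx : t \in \mathbb{R}\}$. Find the point on the line closest to any point $v \in \mathbb{R}^d$.
If $P(v)$ is the point on the line that is closest to $v$, then it is given by the projection formula:
$$P(v) = x\,\frac{\langle x, v \rangle}{\langle x, x \rangle}$$
This is fairly elementary to establish. To find the value of $t$ such that the distance $\rho(v, tx)$ is minimized, we can simply consider the squared distance $\rho^2(v, tx)$, since it is easier to work with. By definition this is:
$$\rho^2(v, tx) = \langle v - tx, v - tx \rangle$$
which is simply:
$$\rho^2(v, tx) = \langle v, v \rangle - 2t\langle v, x \rangle + t^2\langle x, x \rangle$$
So, the value of $t$ for which the above is minimized is $t = \langle v, x \rangle / \langle x, x \rangle$. Note that this simply reproduces the projection formula.
Therefore, the minimum squared distance is given by the expression below:
$$\min_{t \in \mathbb{R}} \rho^2(v, tx) = \frac{\langle v, v \rangle\langle x, x \rangle - \langle v, x \rangle^2}{\langle x, x \rangle}$$
Note that the L.H.S. is always non-negative. Therefore we have:
$$\frac{\langle v, v \rangle\langle x, x \rangle - \langle v, x \rangle^2}{\langle x, x \rangle} \geq 0$$
Rearranging, we have:
$$\langle v, x \rangle^2 \leq \langle v, v \rangle\langle x, x \rangle$$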
Which is just Cauchy-Schwarz, thus proving the inequality.
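The projection argument is easy to try out numerically. The sketch below assumes the usual dot product on $\mathbb{R}^d$, takes the minimizing parameter to be $t = \langle v, x \rangle / \langle x, x \rangle$, and compares the resulting squared distance with the closed form $\left(\langle v, v \rangle\langle x, x \rangle - \langle v, x \rangle^2\right) / \langle x, x \rangle$. The function names are made up for the illustration.

```python
def dot(u, w):
    """Standard dot product of two equal-length lists."""
    return sum(p * q for p, q in zip(u, w))


def project_onto_line(x, v):
    """Return (t, t*x): the parameter and the point of the line {t*x} closest to v."""
    xx = dot(x, x)
    if xx == 0:
        raise ValueError("x must be a nonzero vector")
    t = dot(v, x) / xx
    return t, [t * xi for xi in x]


def squared_distance(u, w):
    """Squared Euclidean distance between u and w."""
    return sum((p - q) ** 2 for p, q in zip(u, w))


x = [1.0, 1.0]
v = [2.0, 0.0]
t, p = project_onto_line(x, v)
closed_form = (dot(v, v) * dot(x, x) - dot(v, x) ** 2) / dot(x, x)
print(t, p, squared_distance(v, p), closed_form)  # 1.0 [1.0, 1.0] 2.0 2.0
```

The closed form is a squared distance, so it cannot be negative, and that non-negativity is exactly the Cauchy-Schwarz inequality for $v$ and $x$.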
Proof 8: Proof using an identity
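A variant of this proof is amongst the most common Cauchy-Schwarz proofs given in textbooks. It is also related to proof (6) above. However, it still has some value in its own right, while also giving a useful expression for the “defect” in Cauchy-Schwarz, like the Lagrange Identity above.
We start with the following polynomial (for vectors $v$ and $w \neq 0$ in a real inner product space):
$$P(t) = \langle v - tw, v - tw \rangle = \langle v, v \rangle - 2t\langle v, w \rangle + t^2\langle w, w \rangle$$
Clearly $P(t) \geq 0$.
To find the minimum of this polynomial, we take its derivative w.r.t. $t$ and set it to zero:
$$P'(t) = 2t\langle w, w \rangle - 2\langle v, w \rangle = 0 \quad \Longrightarrow \quad t_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle}$$
Clearly we have $P(t) \geq P(t_0) \geq 0$. We consider $P(t_0) \geq 0$; substituting $t_0$ we have:
$$\langle v, v \rangle - \frac{\langle v, w \rangle^2}{\langle w, w \rangle} \geq 0 \qquad (\ast)$$
Just rearranging and simplifying:
$$\langle v, w \rangle^2 \leq \langle v, v \rangle\langle w, w \rangle$$
This proves the Cauchy-Schwarz inequality.
Now suppose we are interested in an expression for the defect in Cauchy-Schwarz, i.e. the difference $\langle v, v \rangle\langle w, w \rangle - \langle v, w \rangle^2$. For this we can just consider the L.H.S. of equation $(\ast)$, which is $P(t_0)$, since the defect is just $\langle w, w \rangle P(t_0)$.
i.e. Defect = $\langle w, w \rangle\left(\langle v, v \rangle - 2t_0\langle v, w \rangle + t_0^2\langle w, w \rangle\right)$
Which is just:
$$\langle v, v \rangle\langle w, w \rangle - \langle v, w \rangle^2 = \langle w, w \rangle \left\langle v - \frac{\langle v, w \rangle}{\langle w, w \rangle} w, \; v - \frac{\langle v, w \rangle}{\langle w, w \rangle} w \right\rangle$$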
This defect term is much in the spirit of the defect term that we saw in Lagrange’s identity above, and it is instructive to compare them.
Proof 9: Proof using the AM-GM inequality
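Let us first recall the AM-GM inequality:
For non-negative reals $x_1, x_2, \dots, x_n$ we have the following basic inequality:
$$(x_1 x_2 \cdots x_n)^{1/n} \leq \frac{x_1 + x_2 + \dots + x_n}{n}$$
Now let us define $A = \sqrt{a_1^2 + a_2^2 + \dots + a_n^2}$ and $B = \sqrt{b_1^2 + b_2^2 + \dots + b_n^2}$ (both assumed nonzero).
Now consider the trivial bound (which gives us the AM-GM): $(x - y)^2 \geq 0$, which is just $xy \leq \frac{1}{2}(x^2 + y^2)$. Note that AM-GM as stated above for $n = 2$ is immediate when we consider $x \to \sqrt{x}$ and $y \to \sqrt{y}$.
Using the above with $x = a_i / A$ and $y = b_i / B$, we have:
$$\frac{a_i}{A}\cdot\frac{b_i}{B} \leq \frac{1}{2}\left(\frac{a_i^2}{A^2} + \frac{b_i^2}{B^2}\right)$$
Summing over $i$, we have:
$$\sum_{i=1}^n \frac{a_i b_i}{AB} \leq \frac{1}{2}\left(\sum_{i=1}^n \frac{a_i^2}{A^2} + \sum_{i=1}^n \frac{b_i^2}{B^2}\right)$$
But note that the R.H.S. equals 1, therefore:
$$\sum_{i=1}^n \frac{a_i b_i}{AB} \leq 1$$
Writing out $A$ and $B$ as defined above, we have:
$$\sum_{i=1}^n a_i b_i \leq \sqrt{\sum_{i=1}^n a_i^2}\,\sqrt{\sum_{i=1}^n b_i^2}$$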
Thus proving the Cauchy-Schwarz inequality.
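The normalization step can be checked term by term. The sketch below assumes $A$ and $B$ are the Euclidean norms of the two sequences, and compares each normalized product $\frac{a_i}{A}\cdot\frac{b_i}{B}$ with its bound $\frac{1}{2}\left(\frac{a_i^2}{A^2} + \frac{b_i^2}{B^2}\right)$. The function name `normalized_terms` is invented for this example.

```python
import math


def normalized_terms(a, b):
    """Return (products, bounds): products[i] = (a_i/A)(b_i/B) and
    bounds[i] = ((a_i/A)^2 + (b_i/B)^2) / 2, with A, B the Euclidean norms."""
    A = math.sqrt(sum(ai * ai for ai in a))
    B = math.sqrt(sum(bi * bi for bi in b))
    if A == 0 or B == 0:
        raise ValueError("neither sequence may be identically zero")
    products = [(ai / A) * (bi / B) for ai, bi in zip(a, b)]
    bounds = [0.5 * ((ai / A) ** 2 + (bi / B) ** 2) for ai, bi in zip(a, b)]
    return products, bounds


products, bounds = normalized_terms([3.0, -1.0, 2.0], [1.0, 4.0, -2.0])
# Each product is at most its bound, the bounds add up to 1 (up to rounding),
# and so the normalized inner product is at most 1.
print(all(p <= q + 1e-12 for p, q in zip(products, bounds)))
print(abs(sum(bounds) - 1.0) < 1e-12, sum(products) <= 1.0 + 1e-12)
```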
Proof 10: Using Jensen’s Inequality
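We begin by recalling Jensen’s Inequality:
Suppose that $f : [p, q] \to \mathbb{R}$ is a convex function. Also suppose that there are non-negative numbers $p_1, p_2, \dots, p_n$ such that $\sum_{i=1}^n p_i = 1$. Then for all $x_i \in [p, q]$, for $i = 1, 2, \dots, n$, one has:
$$f\left(\sum_{i=1}^n p_i x_i\right) \leq \sum_{i=1}^n p_i f(x_i)$$
Now we know that $f(x) = x^2$ is convex. Applying Jensen’s Inequality, we have:
$$\left(\sum_{i=1}^n p_i x_i\right)^2 \leq \sum_{i=1}^n p_i x_i^2$$
Now, for $b_i \neq 0$ for all $i$, let $x_i = a_i / b_i$ and let $p_i = b_i^2 / \sum_{j=1}^n b_j^2$. This gives:
$$\left(\frac{\sum_{i=1}^n a_i b_i}{\sum_{j=1}^n b_j^2}\right)^2 \leq \frac{\sum_{i=1}^n a_i^2}{\sum_{j=1}^n b_j^2}$$
Rearranging this just gives the familiar form of Cauchy-Schwarz at once:
$$\left(\sum_{i=1}^n a_i b_i\right)^2 \leq \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right)$$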
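To see the substitution at work, here is a small sketch. It assumes the weights are $p_i = b_i^2 / \sum_j b_j^2$ and the points are $x_i = a_i / b_i$ (so every $b_i$ must be nonzero), applies Jensen's inequality for $f(t) = t^2$, and rescales both sides by $\left(\sum_j b_j^2\right)^2$ to recover the two sides of Cauchy-Schwarz. The function names are hypothetical.

```python
def jensen_sides_for_square(p, x):
    """For f(t) = t^2, return (f(sum p_i x_i), sum p_i f(x_i))."""
    mean = sum(pi * xi for pi, xi in zip(p, x))
    return mean ** 2, sum(pi * xi * xi for pi, xi in zip(p, x))


def cauchy_schwarz_from_jensen(a, b):
    """Return ((sum a_i b_i)^2, sum a_i^2 * sum b_i^2), computed via Jensen."""
    if any(bi == 0 for bi in b):
        raise ValueError("every b_i must be nonzero for x_i = a_i / b_i")
    s = sum(bi * bi for bi in b)
    p = [bi * bi / s for bi in b]            # non-negative weights summing to 1
    x = [ai / bi for ai, bi in zip(a, b)]    # the points fed to the convex function
    left, right = jensen_sides_for_square(p, x)
    # Multiply through by s^2 to undo the normalization.
    return left * s * s, right * s * s


lhs, rhs = cauchy_schwarz_from_jensen([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(lhs <= rhs + 1e-9)
```

For this input the two returned values are, up to rounding, $32^2 = 1024$ and $14 \cdot 77 = 1078$.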
Proof 11: Pictorial Proof for d = 2
Here (page 4) is an attractive pictorial proof by means of tilings for the case $d = 2$ by Roger Nelsen.