In this post I will derive an oft used inequality in Theoretical computer science, using the Chernoff technique. The following function turns up in a few places in combinatorial geometry and other places.

Here we assume
since otherwise the function is trivially seen to be equal to
. Here we will show that,

Two easy interpretations of
are as follows. First, it counts the number of points in the hamming cube
whose hamming distance from the origin is at most
. This interpretation also allows us to derive the nice identity,

for any
. In the above expression we obey the convention
when
. To see this, interpret the above sum as the distance from a fixed point
whose distance from the origin is
. The
is the distance from
and the inner summation counts the number of points at distance
. Then noting that the hamming cube looks the same from every point so the number of points of distance at most
is the same from the origin or from
. Hence the identity.
This definition has a nice geometric interpretation because
is also the number of points from
inside any ball of radius
. The second interpretation is as follows. Consider the discrete probability space
with the measure
. Let
be
independent random variables
defined as
for
. Let
. Then we have,

for any
. As such we have,
.
We will use this interpretation to derive our inequality.
Lemma:
.
Proof: Our first observation is that we may take
. The function
is an increasing function of
. Denoting this by
and extending it to a function on
we have by logarithmic differentiation,

Thus for
we have
. On the other hand
trivially.
Thus assuming
is
. Now we have the following identity,

so it sufficies to upper bound
. We now use the Chernoff technique. For any
,
![\displaystyle \begin{array}{rcl} \Phi_d(m) &=& 2^m \Pr\{ X \geq m - d \} \\\\ &=& 2^m \Pr \{ e^{\lambda X} \geq e^{\lambda(m - d)} \} \\\\ &\leq& 2^m \frac{{\bf E}(e^{\lambda X})}{e^{\lambda(m - d)}} ~~ \text{[By Markov's Inequality]} \\\\ &=& 2^m \frac{\prod_{i=1}^m {\bf E}(e^{\lambda X_i})}{e^{\lambda(m-d)}} ~~ \text{[By Independence of the} ~ X_i] \end{array} \displaystyle \begin{array}{rcl} \Phi_d(m) &=& 2^m \Pr\{ X \geq m - d \} \\\\ &=& 2^m \Pr \{ e^{\lambda X} \geq e^{\lambda(m - d)} \} \\\\ &\leq& 2^m \frac{{\bf E}(e^{\lambda X})}{e^{\lambda(m - d)}} ~~ \text{[By Markov's Inequality]} \\\\ &=& 2^m \frac{\prod_{i=1}^m {\bf E}(e^{\lambda X_i})}{e^{\lambda(m-d)}} ~~ \text{[By Independence of the} ~ X_i] \end{array}](http://s0.wp.com/latex.php?latex=%5Cdisplaystyle+%5Cbegin%7Barray%7D%7Brcl%7D+%5CPhi_d%28m%29+%26%3D%26+2%5Em+%5CPr%5C%7B+X+%5Cgeq+m+-+d+%5C%7D+%5C%5C%5C%5C+%26%3D%26+2%5Em+%5CPr+%5C%7B+e%5E%7B%5Clambda+X%7D+%5Cgeq+e%5E%7B%5Clambda%28m+-+d%29%7D+%5C%7D+%5C%5C%5C%5C+%26%5Cleq%26+2%5Em+%5Cfrac%7B%7B%5Cbf+E%7D%28e%5E%7B%5Clambda+X%7D%29%7D%7Be%5E%7B%5Clambda%28m+-+d%29%7D%7D+%7E%7E+%5Ctext%7B%5BBy+Markov%27s+Inequality%5D%7D+%5C%5C%5C%5C+%26%3D%26+2%5Em+%5Cfrac%7B%5Cprod_%7Bi%3D1%7D%5Em+%7B%5Cbf+E%7D%28e%5E%7B%5Clambda+X_i%7D%29%7D%7Be%5E%7B%5Clambda%28m-d%29%7D%7D+%7E%7E+%5Ctext%7B%5BBy+Independence+of+the%7D+%7E+X_i%5D+%5Cend%7Barray%7D+&bg=ffffff&fg=000000&s=0)
Now for any
we have,

and so we have,

We try to optimize the above expression by finding the
which minimizes the expression above. Noting that
we have, by differentiation and observing the sign of the derivative, that the
minimizing the expression is given by,

and substituing that in the estimate we have,

The last step follows from the simple fact that,

