Projections onto a Convex Body

August 23, 2010

Given a point {x} \in {{\bf R}^d} and an affine subspace {H \neq {\bf R}^d}, the projection of {x} onto {H} is the point {P_H(x)} such that,

\displaystyle \begin{array}{rcl} \vert \vert x - P_H(x) \vert \vert = \min \{ \vert \vert x y \vert \vert ~ \mid y ~ \in H \} \end{array} .

One can obviously define other notions of projection but the above is probably the most commonly used in geometry. If {x \notin H} then the projected point {P_H(x)} is the unique point in {H} such that the vector {P_H(x) - x} is orthogonal to {H}. Also is well known that projecting any segment to an affine subspace can only shrink its length. The proofs of these facts are easy to see. But in fact these facts are just corollaries of the following two results.

Given any nonempty closed convex set {C \subseteq {\bf R}^d} and a point {x \in {\bf R}^d}, there is a unique point {P_C(x) \in C} such that,

\displaystyle \begin{array}{rcl} \vert \vert x - P_C(x) \vert \vert = \min \{ \vert \vert x - y \vert \vert ~ \mid ~ y \in C\} \end{array} .


Let {C} be a nonempty closed convex set. The mapping {P_C : {\bf R}^d \rightarrow C} is a contraction i.e. for any {x,y \in {\bf R}^d} we have,

\displaystyle \begin{array}{rcl} \vert \vert P_C(x) - P_C(y) \vert \vert \leq \vert \vert x - y \vert \vert \end{array}

Applying these results to affine spaces (which are nonempty closed convex sets) yields the results mentioned earlier. This projection that maps x to P_C(x) is known as the metric projection. The proofs of these facts are in order. Read the rest of this entry »

Bounding the Volume of Hamming Balls

August 13, 2010

In this post I will derive an oft used inequality in Theoretical computer science, using the Chernoff technique. The following function turns up in a few places in combinatorial geometry and other places.

\displaystyle \begin{array}{rcl} \Phi_d(m) = \sum_{i=0}^d {m \choose i}. \end{array}

Here we assume {m \geq d} since otherwise the function is trivially seen to be equal to {2^m}. Here we will show that,

\displaystyle \begin{array}{rcl} \Phi_d(m) \leq \left ( \frac{me}{d} \right )^d \end{array}

Two easy interpretations of {\Phi_d(m)} are as follows. First, it counts the number of points in the hamming cube {{\bf H}^m = \{0,1\}^m} whose hamming distance from the origin is at most {d}. This interpretation also allows us to derive the nice identity,

\displaystyle \begin{array}{rcl} \Phi_d(m) = \sum_{h=0}^d \sum_{l = 0}^h {s \choose l}{m - s \choose h - l}. \end{array}

for any {0 \leq s \leq m}. In the above expression we obey the convention {{p \choose q} = 0} when {p < q}. To see this, interpret the above sum as the distance from a fixed point {p \in {\bf H}^m} whose distance from the origin is {s}. The {h} is the distance from {p} and the inner summation counts the number of points at distance {h}. Then noting that the hamming cube looks the same from every point so the number of points of distance at most {d} is the same from the origin or from {p}. Hence the identity.

This definition has a nice geometric interpretation because {\Phi_d(m)} is also the number of points from {{\bf H}^n} inside any ball of radius {d}. The second interpretation is as follows. Consider the discrete probability space {\Omega = \{ 0, 1 \}} with the measure {p(\{0\}) = p(\{1\}) = 1/2}. Let {X_1, X_2, \dots, X_m} be {m} independent random variables {X_i : \Omega \rightarrow \{0,1\}} defined as {X_i(y) = y} for {y \in \Omega}. Let {X = \sum X_i}. Then we have,

\displaystyle \Pr\{ X_1 = a_1, X_2 = a_2, \dots, X_m = a_m \} = \frac{1}{2^m},

for any {a_i \in \{0,1\}}. As such we have,

\displaystyle \frac{\Phi_d(m)}{2^m} = \Pr\{ X \leq d\}.

We will use this interpretation to derive our inequality.

Lemma: {\Phi_d(m) \leq \left ( \frac{me}{d} \right )^d}.

Proof: Our first observation is that we may take {d \leq m/2}. The function {\left ( \frac{me}{d} \right )^d} is an increasing function of {d}. Denoting this by {f(d)} and extending it to a function on {[0,m]} we have by logarithmic differentiation,

\displaystyle \begin{array}{rcl} \frac{f'(d)}{f(d)} = \ln \left ( \frac{m}{d} \right ) \geq 0 \end{array}

Thus for {d > m/2} we have {f(d) \geq f(m/2) \geq (2e)^{m/2} > 2^m}. On the other hand {\Phi_d(m) \leq 2^m} trivially.

Thus assuming {d} is {0 \leq d \leq m/2}. Now we have the following identity,

\displaystyle \begin{array}{rcl} \sum_{i=0}^d {m \choose i} = \sum_{j=m-d}^m {m \choose j} = 2^m \Pr\{ X \geq m-d \}. \end{array}

so it sufficies to upper bound {\Pr \{ X \geq m - d \}}. We now use the Chernoff technique. For any {\lambda > 0},

\displaystyle \begin{array}{rcl} \Phi_d(m) &=& 2^m \Pr\{ X \geq m - d \} \\\\ &=& 2^m \Pr \{ e^{\lambda X} \geq e^{\lambda(m - d)} \} \\\\ &\leq& 2^m \frac{{\bf E}(e^{\lambda X})}{e^{\lambda(m - d)}} ~~ \text{[By Markov's Inequality]} \\\\ &=& 2^m \frac{\prod_{i=1}^m {\bf E}(e^{\lambda X_i})}{e^{\lambda(m-d)}} ~~ \text{[By Independence of the} ~ X_i] \end{array}

Now for any {i} we have,

\displaystyle {{\bf E}(e^{\lambda X_i}) = \frac{e^{\lambda} + 1}{2}}

and so we have,

\displaystyle \begin{array}{rcl} \Phi_d(m) \leq \frac{(e^{\lambda} + 1)^m}{e^{\lambda(m-d)}} \end{array}

We try to optimize the above expression by finding the {\lambda} which minimizes the expression above. Noting that {m/2 \geq d} we have, by differentiation and observing the sign of the derivative, that the {\lambda} minimizing the expression is given by,

\displaystyle \begin{array}{rcl} e^{\lambda} = \frac{m}{d} - 1 \end{array}

and substituing that in the estimate we have,

\displaystyle \begin{array}{rcl} \Phi_d(m) &\leq& \frac{\left ( \frac{m}{d} \right )^m}{\left ( \frac{m-d}{d}\right )^{m-d}} \\ &=& \left ( 1 + \frac{d}{m-d} \right )^d \left ( \frac{m}{d} \right )^d \leq \left ( \frac{me}{d} \right )^d. \end{array}

The last step follows from the simple fact that,

\displaystyle \begin{array}{rcl} \left ( 1 + \frac{d}{m-d} \right )^{m - d} \leq \left ( e^{\frac{d}{m-d}} \right )^{m-d} = e^d. \end{array}


Jung’s Theorem

August 7, 2010

Given a set of points of diameter {D} in {n} dimensional Euclidean space {{\bf R}^n} it is trivial to see that it can be covered by a ball of radius {D}. But the following theorem by Jung improves the result by a factor of about {\frac{1}{\sqrt{2}}}, and is the best possible.
Theorem 1 [ Jung’s Theorem: ] Let {S} be a set of points in {{\bf R}^n} of diameter {D}. Then there is a ball of radius {\sqrt{\frac{n}{2n + 2}} D} covering {S}.
Proof: We first prove this theorem for sets of points {S} with {\vert S \vert \leq n + 1} and then extend it to an arbitrary point set. If {\vert S \vert \leq n + 1} then the smallest ball enclosing {S} exists. We assume that its center is the origin. Denote its radius by {r}. Denote by {S' \subseteq S} the subset of points such that {||p|| = r} for {p \in S'}. It is easy to see that {S'} is in fact non empty.

Observation: The origin must lie in the convex hull of {S'}. Assuming the contrary, there is a separating hyperplane {H} such that {S'} lies on one side and the origin lies on the other side of {H} (strictly). By assumption, every point in {S \setminus S'} is distance strictly less than {r} from the origin. Move the center of the ball slightly from the origin, in a direction perpendicular to the hyperplane {H} towards {H} such that the distances from the origin to every point in {S \setminus S'} remains less than {r}. However, now the distance to every point of {S'} is decreased and so we will have a ball of radius strictly less than {r} enclosing {S} which is a contradiction to the minimality of {r}.

Let {S' = \{ p_1, p_2,\dots, p_m\}} where {m \leq n \leq d+1} and because the origin is in the convex hull of {S'} so we have nonnegative {\lambda_i} such that,

\displaystyle \sum \lambda_i p_i = 0, \sum \lambda_i = 1

Fix a {k, 1 \leq k \leq m}. Then we have,
\displaystyle \begin{array}{rcl} 1 - \lambda_k &=& \sum_{i \neq k} \lambda_i \\ &\geq& \frac{1}{D^2}\sum_{i = 1}^m \lambda_i || p_i -p_k ||^2 \\ &=& \frac{1}{D^2}\left ( \sum_{i=1}^m \lambda_i (2r^2 - 2 \langle p_i, p_k \rangle) \right ) \\ &=& \frac{1}{D^2} \left ( 2r^2 - 2 \left \langle \sum_{i=1}^m \lambda_i p_i, p_k \right \rangle \right ) \\ &=& \frac{2r^2}{D^2} \end{array}

Adding up the above inequalities for all values of {k}, we get
\displaystyle \begin{array}{rcl} m - 1 \geq 2m r^2 / D^2 \end{array}

Thus we get {\frac{r^2}{D^2} \leq m - 1 / 2m \leq \frac{n}{2n + 2}} since {m \leq n + 1} and the function (x-1)/2x is monotonic. So we have immediately {r \leq \sqrt{\frac{n}{2n+2}} D}.
The remainder of the proof uses the beautiful theorem of Helly. So assume {S} is any set of points of diameter {D}. With each point as center draw a ball of radius {r = \sqrt{\frac{n}{2n+2}}D}. Clearly any {n + 1} of these balls intersect. This is true because the center of the smallest ball enclosing {n+1} of the points is at most {r} away from each of those points. So we have a collection of compact convex sets, any {n+1} of which have a nonempty intersection. By Helly’s theorem all of them have an intersection. Any point of this intersection can be chosen to be the center of a ball of radius {r} that will enclose all of {S}. \Box