Math blogs vs CS blogs

March 31, 2008

I am very impressed by many professional mathematicians posting expository articles on their blog. One of the great advantages it has over just a survey article is that the comments section allows direct interaction with the author. Also, in many cases the presentation is more relaxed than a research paper and more fast paced than a book. I have examples of at least three blogs : Terry Tao’s blog , Rigorous Trivialities and The Unapologetic Mathematician , with regular detailed postings on various topics. I am wondering why there arent so many CS theory blogs? Luca’s blog is a great one that stands out in this regard. There has a been a lot of discussion about selling CS theory, see for example this post by Lance Fortnow. I think that one of the ways to do this is to write good expository articles and popularize various ideas that arise again and again in theoretical CS.

Party Snaps

March 19, 2008

Some photographs from a party on March 8th. Unfortunately I do not have some snaps of the delicious food cooked by Neha. However I am sure Amit has taken some snaps and I will ask him to send me a link to those.

Update 03/20/2008 : I have uploaded some more photos to the same folder on flickr. These were taken by Amit.

Finding Percentiles

March 19, 2008

Suppose you are given a multiset of real numbers A=\{a_1,a_2,\cdots,a_n\}. Given a number b, one can ask the question : Find the fraction of points in A that are less than or equal to b. Here is a simple randomized algorithm to solve this.

1. Choose K elements of A uniformly at random. Sort the elements. Let B be the sorted multiset.
2. Find out the fraction g of points in the new multiset B that are less than or equal b.
3. Output g

How good is this algorithm? Here is an attempt at an analysis. For a number x denote by F(x,A) the fraction of elements in A less than or equal to x. Suppose F(b,A) = f. Suppose F(b,B) = g. Then g|B| = K of the elements were chosen from f|A| elements and (1-g)|B| = (|B| - K) of the elements were chosen from (1-f)|A|. Hence the probability is \binom{|B|}{K} \cdot K^f \cdot (|B|-K)^(1-f). Therefore K has a binomial distribution B(|B|,f). The expected value of K is f|B| and the standard deviation is \sqrt{|B|f(1-f)}. By the The probability that we become wrong by \delta is Pr[|g - f| > \delta f]. By using a modified form of the Hoeffding’s bound the probability is upper bounded by 2e^{-\frac{\delta^2 |B| f}{2} }. If f > 0 then how large should |B| be so that we are at most off by 1% of f with probability 0.99? A simple calculation shows that if |B| > 1060/f then we are good. If The nice part about use of this bound is that it does not depend on the size of the underlying multiset A. However there is a dependance on f that is unknown in the first place! However if we assume f \geq 0.1 then we can get away with sampling a few thousand elements. Are there any better known algorithms? In randomized algorithms that produce 0-1 answers it is possible to increase the probability of success by repeating algorithm multiple times and then taking the majority of the output as the true answer. Is there a similar way of improving the overall accuracy of algorithms that produce one of many possible values such as the percentile algorithm?

Superconcentrators From Expanders

March 7, 2008

Today I will be blogging about the construction of superconcentrators from expanders. This is directly from the expanders survey. I am just reviewing it here for my own learning. First some definitions.
Definition 1. A d- regular bipartitie graph (L,R,E), with |L|=n and |R| =m is called a \emph{magical graph} if it saisfies the properties below. For a given set S of vertices, we denote the set of vertices to which some vertex in S is connected as \Gamma(S).

1. For every S with |S| \leq \frac{n}{10}, |\Gamma(S)| \geq \frac{5d}{8} |S|
2. For every S with \frac{n}{10} < |S| \leq \frac{n}{2}, |\Gamma(S)| \geq |S|.

Definition 2. A superconcentrator is a graph G=(V,E) with two given subsets I, O \subseteq S with |I|=|O|=n, such that for every S \subseteq I and T \subseteq O with |S|=|T|=k, the number of disjoint paths from S to T is at least k.

Superconcentrators with O(n) edges are interesting for various reasons which we do not go into here.

But we do give a construction of a superconcentrator with O(n) edges from magical graphs above. A simple probabilistic argument can show the following result.

Theorem There exists a constant n_0 such that for every d \geq 32, such that n \geq m,n_0 \geq \frac{3n}{4}, there is a magical graph with |L|=n,|R|=m.

Here is the construction of a superconcentrator from magical graphs. Assume that we can construct a superconcentrator with O(m) edges for every m \leq (n-1). The construction is recursive. First take two copies of the magical graphs (L_1,R_1,E_1), (L_2,R_2,E_2) with |L_1|=|L_2|=n and |R_1|=|R_2|=\frac{3n}{4}. Connect every vertex of L_1 to the corresponding vertex of L_2 and add edges between R_1=I and R_2=O so that the graph becomes a superconcentrator with the size of the input vertex set as \frac{3n}{4}. We claim that the resulting graph is a superconcentrator with input vertex set of size n.

Identify the input vertex set as L_1 and the output vertex set as L_2. For every | S \subseteq L_1 | = k \leq \frac{n}{2} it is true that | \Gamma(S) \cap R_1 | \geq |S|. Therefore by Halls Marriage Theorem there exists a perfect matching between vertices of S and \Gamma(S) \cap R_1. Similarly there exists a perfect matching between vertices of T \subseteq L_2 and \Gamma(T) \cap R_2. Together with the edges of the superconcentrator between input set R_1 and output set R_2 there are at least k disjoint paths between S and T. It remains to handle the case of k > \frac{n}{2}. For |S|=|T|=k > \frac{n}{2}, there is a subset U \subseteq S of vertices with |U| \geq (k - \frac{n}{2}) such that the vertices corresponding to U in L_2 are in T. Edges between such vertices contribute to (k- \frac{n}{2}) disjoint paths between S and T. The remaining \frac{n}{2} disjoint paths exist as proved earlier. Hence we have a superconcentrator with input, output vertex sets of size n. How many edges are there in this graph? For the base case of this recursion, we let a superconcentrator with input output sets of size n \leq n_0 be the complete bipartite graph with n_0 ^2 edges. The following recursion counts the edges. Let e(k) denote the number of edges in the superconcentrator as per this construction with input, output sets of size k. Then

e(k) = k^2 for k \leq n_0 and e(k) = k + 2dk + e(\frac{3k}{4}) for k > n_0. It can be easily seen that e(n) \leq Cn for C \geq max(n_0,4(2d+1)).