## Math blogs vs CS blogs

March 31, 2008

I am very impressed by many professional mathematicians posting expository articles on their blog. One of the great advantages it has over just a survey article is that the comments section allows direct interaction with the author. Also, in many cases the presentation is more relaxed than a research paper and more fast paced than a book. I have examples of at least three blogs : Terry Tao’s blog , Rigorous Trivialities and The Unapologetic Mathematician , with regular detailed postings on various topics. I am wondering why there arent so many CS theory blogs? Luca’s blog is a great one that stands out in this regard. There has a been a lot of discussion about selling CS theory, see for example this post by Lance Fortnow. I think that one of the ways to do this is to write good expository articles and popularize various ideas that arise again and again in theoretical CS.

## Party Snaps

March 19, 2008

Some photographs from a party on March 8th. Unfortunately I do not have some snaps of the delicious food cooked by Neha. However I am sure Amit has taken some snaps and I will ask him to send me a link to those.

Update 03/20/2008 : I have uploaded some more photos to the same folder on flickr. These were taken by Amit.

## Finding Percentiles

March 19, 2008

Suppose you are given a multiset of real numbers $A=\{a_1,a_2,\cdots,a_n\}$. Given a number $b$, one can ask the question : Find the fraction of points in $A$ that are less than or equal to $b$. Here is a simple randomized algorithm to solve this.

 1. Choose $K$ elements of $A$ uniformly at random. Sort the elements. Let $B$ be the sorted multiset. 2. Find out the fraction $g$ of points in the new multiset $B$ that are less than or equal $b$. 3. Output $g$

How good is this algorithm? Here is an attempt at an analysis. For a number $x$ denote by $F(x,A)$ the fraction of elements in $A$ less than or equal to $x$. Suppose $F(b,A) = f$. Suppose $F(b,B) = g$. Then $g|B| = K$ of the elements were chosen from $f|A|$ elements and $(1-g)|B| = (|B| - K)$ of the elements were chosen from $(1-f)|A|$. Hence the probability is $\binom{|B|}{K} \cdot K^f \cdot (|B|-K)^(1-f)$. Therefore $K$ has a binomial distribution $B(|B|,f)$. The expected value of $K$ is $f|B|$ and the standard deviation is $\sqrt{|B|f(1-f)}$. By the The probability that we become wrong by $\delta$ is $Pr[|g - f| > \delta f]$. By using a modified form of the Hoeffding’s bound the probability is upper bounded by $2e^{-\frac{\delta^2 |B| f}{2} }$. If $f > 0$ then how large should $|B|$ be so that we are at most off by 1% of $f$ with probability $0.99$? A simple calculation shows that if $|B| > 1060/f$ then we are good. If The nice part about use of this bound is that it does not depend on the size of the underlying multiset $A$. However there is a dependance on $f$ that is unknown in the first place! However if we assume $f \geq 0.1$ then we can get away with sampling a few thousand elements. Are there any better known algorithms? In randomized algorithms that produce $0-1$ answers it is possible to increase the probability of success by repeating algorithm multiple times and then taking the majority of the output as the true answer. Is there a similar way of improving the overall accuracy of algorithms that produce one of many possible values such as the percentile algorithm?

## Superconcentrators From Expanders

March 7, 2008

Today I will be blogging about the construction of superconcentrators from expanders. This is directly from the expanders survey. I am just reviewing it here for my own learning. First some definitions.
Definition 1. A $d-$ regular bipartitie graph $(L,R,E)$, with $|L|=n$ and $|R| =m$ is called a $\emph{magical graph}$ if it saisfies the properties below. For a given set $S$ of vertices, we denote the set of vertices to which some vertex in $S$ is connected as $\Gamma(S)$.

1. For every $S$ with $|S| \leq \frac{n}{10}, |\Gamma(S)| \geq \frac{5d}{8} |S|$
2. For every $S$ with $\frac{n}{10} < |S| \leq \frac{n}{2}$, $|\Gamma(S)| \geq |S|$.

Definition 2. A superconcentrator is a graph $G=(V,E)$ with two given subsets $I, O \subseteq S$ with $|I|=|O|=n$, such that for every $S \subseteq I$ and $T \subseteq O$ with $|S|=|T|=k$, the number of disjoint paths from $S$ to $T$ is at least $k$.

Superconcentrators with $O(n)$ edges are interesting for various reasons which we do not go into here.

But we do give a construction of a superconcentrator with $O(n)$ edges from magical graphs above. A simple probabilistic argument can show the following result.

Theorem There exists a constant $n_0$ such that for every $d \geq 32$, such that $n \geq m,n_0 \geq \frac{3n}{4}$, there is a magical graph with $|L|=n,|R|=m$.

Here is the construction of a superconcentrator from magical graphs. Assume that we can construct a superconcentrator with $O(m)$ edges for every $m \leq (n-1)$. The construction is recursive. First take two copies of the magical graphs $(L_1,R_1,E_1), (L_2,R_2,E_2)$ with $|L_1|=|L_2|=n$ and $|R_1|=|R_2|=\frac{3n}{4}$. Connect every vertex of $L_1$ to the corresponding vertex of $L_2$ and add edges between $R_1=I$ and $R_2=O$ so that the graph becomes a superconcentrator with the size of the input vertex set as $\frac{3n}{4}$. We claim that the resulting graph is a superconcentrator with input vertex set of size $n$.

Identify the input vertex set as $L_1$ and the output vertex set as $L_2$. For every $| S \subseteq L_1 | = k \leq \frac{n}{2}$ it is true that $| \Gamma(S) \cap R_1 | \geq |S|$. Therefore by Halls Marriage Theorem there exists a perfect matching between vertices of $S$ and $\Gamma(S) \cap R_1$. Similarly there exists a perfect matching between vertices of $T \subseteq L_2$ and $\Gamma(T) \cap R_2$. Together with the edges of the superconcentrator between input set $R_1$ and output set $R_2$ there are at least $k$ disjoint paths between $S$ and $T$. It remains to handle the case of $k > \frac{n}{2}$. For $|S|=|T|=k > \frac{n}{2}$, there is a subset $U \subseteq S$ of vertices with $|U| \geq (k - \frac{n}{2})$ such that the vertices corresponding to $U$ in $L_2$ are in $T$. Edges between such vertices contribute to $(k- \frac{n}{2})$ disjoint paths between $S$ and $T$. The remaining $\frac{n}{2}$ disjoint paths exist as proved earlier. Hence we have a superconcentrator with input, output vertex sets of size $n$. How many edges are there in this graph? For the base case of this recursion, we let a superconcentrator with input output sets of size $n \leq n_0$ be the complete bipartite graph with $n_0 ^2$ edges. The following recursion counts the edges. Let $e(k)$ denote the number of edges in the superconcentrator as per this construction with input, output sets of size $k$. Then

$e(k) = k^2$ for $k \leq n_0$ and $e(k) = k + 2dk + e(\frac{3k}{4})$ for $k > n_0$. It can be easily seen that $e(n) \leq Cn$ for $C \geq max(n_0,4(2d+1))$.