Tensor Calculus II

July 10, 2008

In the last post I described a contravariant tensor of rank 1 and a covariant tensor of rank 1. In this post we will consider generalizations of these. We will introduce tensors of arbitrary rank (k,l) where k is the number of contravariant indices and l is the number of covariant indices. How many numbers does such a tensor represent? It is easy to see that if the tensor is defined in n dimensional space, it defines n^{(k+l)} real numbers for each point of space, and each coordinate system. The notation for such a tensor is

A^{i_1 i_2 \dots i_k}_{j_1 j_2 \dots j_l}

Now before we go ahead, I would like to clarify that the above represents just 1 single component of the tensor , out of the n^{(k+l)} components. The reason why I clarified this is that with tensors and the Einstein summation convention introduced in the last post, indices can sometimes be used as a summand. Specifically we said that if an index repeats in a tensor expression in a contravariant postion and a covariant position it stands for summation. You need to sum over that index for all possible values (1 to n). Also, multiple indices can repeat in a tensor expression. In that case one needs to sum over all such indices varying from 1 to n. Therefore a tensor expression in which 1 index is to be summed upon expands into n quantities and a tensor equation in which 2 indicies are to be summed upon expands into a sum of n^2 quantities. Guess what? armed with this understanding we are already ready to understand the general transformation law of tensors. Despair not my friends looking at this expression. If you do not understand this – dont worry. We will probably never see such complicated mixed tensors of general rank (k,l). However, it is necessary to understand how to compute with tensors and their transformation law. So after reading the general law of transformation and understanding what it means, I woul like you to forget it temporarily instead of getting bogged down by the weird equation! I will follow the definition with an example. This will be restricted to 3 dimensions and to our familiar spherical polar and cartesian coordinate systems. If you can understand the stuff in there, I would like you to remember the general computational method there. The general law of tensor transformation for a tensor of rank (k,l) is

\bar{A}^{i_1 i_2 \dots i_k}_{j_1 j_2 \dots j_l} = A^{u_1 u_2 \dots u_k}_{v_1 v_2 \dots v_l}\frac{\partial{\bar{x}^{i_1}}}{\partial{x^{u_1}}} \frac{\partial{\bar{x}^{i_2}}}{\partial{x^{u_2}}} \dots \frac{\partial{\bar{x}^{i_k}}}{\partial{x^{u_k}}} \cdot \frac{\partial{x^{v_1}}}{\partial{\bar{x}^{j_1}}} \frac{\partial{x^{v_2}}}{\partial{\bar{x}^{j_2}}} \dots \frac{\partial{x^{v_l}}}{\partial{\bar{x}^{j_l}}}

In this equation, the left hand side represents the value of 1 component out of the n^{(k+l)} components in the “bar” coordinate system [which is why we denote the A with a bar as the tensor]. The right hand side on the other hand is a huge sum of n^{(k+l)} quantities because of the repeating indices u_1, u_2, \dots, u_k, v_1, v_2, \dots, v_l. Each term of the summation is a product of a component of the tensor with the appropriate partial derivatives – that is, each term is a particular instantiation of the u_1, u_2, \dots, u_k, v_1, v_2, \dots, v_l. The transformation law for the general rank tensor is a direct generalization of the transformation law for rank 1 tensors. The tensors value is needed for the i_1,i_2,\dots,i_k,j_1,j_2,\dots,j_l index and the corresponding terms for the primed coordinate system occur in the numerator or denominator according to its for the i‘s – the contraviant indices or the j‘s – the covariant indices. The rest of the new introduced u_s, v_t have been summed upon since they are repeating indices.

Wow! We have stated the general transformation law and now we will proceed to an example. Before we do that, I would like to state our general roadmap in the future posts. Remember that our goal is to understand the equations of General Relativity. We are studying tensors just because we want to understand the weird symbols and equations in the profound equations laid down by Einstein. So in the next post, we will study general operations on tensors like – addition, subtraction, inner products etc. Along with this we will state some rules to recognize tensors. We will state a general rule by which we can ascertain that a set of number form the components of a tensor. We will use this general rule to prove that some collection of numbers are tensors in later posts. Here we will not however talk of the derivative of a tensor because that involves some more machinery – but hey .. without derivatives there is no calculus so get back we will! The next few posts will talk about the fundamental tensor which is related to measuring distance in space between two points. That will define define distance for us. In the next few posts we will then discuss derivatives of tensors. In doing so we will introduce some tensors in terms of the fundamental tensor. These will be helpful for defining derivatives. Armed with all this machinery about distances and derivatives, we will then state equations for geodesics which are shortest paths between 2 points in space. In ordinary space this is the straight line. But in spaces, where the fundamental tensor is more complicated the geodesics are not “straight” lines [Well by then we will probably wonder what straight means anyway!]. Finally we will have the notions of distance, derivatives and shortest paths in our bag so we will talk about what we really need to understand the Einstein equations – curvature. Specifically we will introduce quantities which represent how curved a space is at a point. After this we will state the Einstein equations and if I can manage I will show some simple consequences of these equations. The Einstein equations somehow relate the curvature of our playground [which is the “space” of 4 dimensional spacetime], to the distribution of mass and energy.

So much for the future posts – all that is probably going to take several posts! But for the time being lets get back to an example using familiar 3 space. In familiar 3 space, we are going to work with 2 coordinate systems – one : cartesian x^1=x, x^2=y, x^3=z and the other spherical polar \bar{x}^1=r, \bar{x}^2=\theta, \bar{x}^3 = \phi. Recall that for a point P, r is the distance from the origin O, 0 \leq \phi \leq \pi is the angle made by OP with the z axis and 0 \leq \theta < 2\pi is the angle made by the projection of OP onto the xy-plane with the x axis. Then it is easy to see that

x^1 = \bar{x}^1 \sin{\bar{x}^3} \cos{\bar{x}^2}

x^2=\bar{x}^1 \sin{\bar{x}^3} \sin{\bar{x}^2}

x^3=\bar{x}^1 \cos{\bar{x}^3}.

The inverse transformations equations are

\bar{x}^1 = \sqrt{ {x^1}^2 + {x^2}^2 + {x^3}^2}

\bar{x}^2=\tan^{-1}\left( \frac{x^2}{x^1}\right)

\bar{x}^3=\cos^{-1}\left( \frac{x^3}{\sqrt{ {x^1}^2 + {x^2}^2 + {x^3}^2}}\right)

See here for an illustration. Assume we are working in a small region of space where all the x^1,x^2,x^3 are nonzero so that \bar{x}^2,\bar{x}^3 are well defined. We also need to verify that the Jacobian matrix of the partial derivatives is non-singular but that is left as an exercise!

We will take a simple case here.


Suppose there is a contravariant tensor that has components \bar{A}^1, \bar{A}^2, \bar{A}^3. What are the components in the first coordinate system? For illustration we only show A^2. Now by the tensor transformation law

A^2 = \bar{A}^i \frac{\partial{x^2}}{\partial{\bar{x}^i}}

and so evaluating the partial derivatives and summing them up A^2 evaluates to

A^2=\bar{A}^1\sin{\bar{x}^2}\sin{\bar{x}^3} + \bar{A}^2\bar{x}^1\cos{\bar{x}^2}\sin{\bar{x}^3}+\bar{A}^3\sin{\bar{x}^2}\cos{\bar{x}^3}

Of course the above represents the value at a specific point in space. A contravariant tensor of rank 2 will have 9 such terms to sum up. We encourage the reader to evaluate the components of such a tensor.

Before the next post do think of some of the things you can do with tensors. Specifically think about

  1. When can two tensors be added/subtracted to produce new tensors?
  2. Can two tensors be multiplied to produce a new tensor?
  3. Consider a function f defined for each point of space. Suppose for each coordinate system (x^1,x^2,\dots, x^n) we define n components at a point by A_i= \frac{\partial{f}}{\partial{x^i}}. Prove that these numbers define a covariant tensor of rank 1.

Tensor Calculus – Part 1

May 12, 2008

Firstly, I am sorry for the outage in posts. Could not find time to write up anything worth a really good post. I have received some verbal requests for a new post. I appreciate all such requests! So without further ado, let me get down to business.

Today’s post will be about the Tensor Calculus . In fact I intend to write much more about them in subsequent posts but we will see how that goes. But before I go to give an introduction, let me state my motivation in studying tensors. While some familiarity with Physics is needed to understand the next paragraph, the tensor calculus is a purely mathematical topic.

The motivation comes from my long standing wish to be able to understand the General Theory of Relativity . The GTR is a geometrical theory of gravity. It explains gravity as a result of the deformation of SpaceTime . Mass and Energy cause SpaceTime to be distorted and curved. This distortion in turn acts back on mass to determine its motion. The perceived effects of gravitation on a particle are a direct result of the motion in the distorted SpaceTime without any mysterious force acting on the particle. That is quiet as much as I know about the GTR for now. But the basic mathematical tool to study the GTR is Reimannian Geometry . Reimannian Geometry is the appropriate generalization, for a general n-dimensional space, to study things like curves, surfaces, curvature, so common from everyday experience in two and three dimensional spaces. Many basic entities in Reimannian Geometry are best described by Tensors. Tensors are also used in many other places and in order to understand current physical theories and their mathematical formulations, one needs to know about tensors and basic differential geometry.

So, now lets get down to an introduction. Consider an n- dimensional space V_n. A coordinate system is an assignment of an n tuple of numbers x^i, i = 1 \cdots n to each point of V_n. In a different coordinate system the same point will have a different set of coordinates \bar{x}^i, i = 1 \cdots n. We also assume that the \bar{x}^i are given by n functions \varphi_i(x^j) of the x^i. The functions are assumed to be such that the Jacobian matrix \frac{\partial{\bar{x}^i}}{\partial{x^j}} is invertible.

Example. In 3 dimensional space, a standard cartesian coordinate system can be the first coordinate system, while the polar coordinates (r,\theta,\phi) can be the primed coordinate system. Or the primed coordinate system can be one with the 3 axes not necessarily at right angles to each other.

Definition. A contravariant tensor of rank 1 is a set of n functions A^i of the coordinates such that they transform according to the following law

\bar{A}^i = A^j \frac{\partial{\bar{x}^i}}{\partial{x^j}}

Now in the above expression you might notice that that the index j is repeated in the right hand side of the equation. That implies a summation . Therefore the expression actually means the following sum

\sum^n_{j=1} A^j \frac{\partial{\bar{x}^i}}{\partial{x^j}} .

In general, in writing tensorial expressions, the above so called summation convention is widely used. Whenever an index is used as a superscript and the same occurs as a subscript, or if it occurs in the numerator and the denominator it is to be summed on over all the coordinates.

The above definition was for a contravariant tensor of rank 1. In fact a contravariant tensor of rank 1 is just a vector. The tensorial transformation law just generalizes the transformation of vectors. To see this consider 3 dimensional space and basis vectors e_1, e_2, e_3. Suppose a vector v = a_1 e_1 + a_2 e_2 + a_3 e_3. Consider a different coordinate system with basis vectors e'_i = \sum_j a_ij e_j. Now in this coordinate system let v = a'_1 e'_1 + a'_2 e'_2 + a'_3 e'_3. Now what are the a'_i as functions of the a_i ? A quick calculation shows that a' = B a where a' is the row vector of coordinates a'_i, a is the row vector of coordinates a_i and B is the inverse of matrix A = (a_{ij}). Also as one can verify b_{ij} = \frac{\partial{\bar{x}^i}}{\partial{x^j}} .

Example. Consider the differentials dx^i. Clearly the differentials in the primed coordinate system are given by

d\bar{x}^i= dx^j \frac{\partial{\bar{x}^i}}{\partial{x^j}}

Next, we define a covariant tensor of rank 1 .

Definition. A covariant tensor of rank 1 is a set of n functions A_i that transform according to

\bar{A}_i = A_j \frac{\partial{x^j}}{\partial{\bar{x}^i}} .

In the next post we will go to the definition of more general tensors called mixed tensors. Such tensors are contravariant or covariant in more than one index or have mix of such indices. We will see the transformation law of such tensors. We will also start with some simple operations on tensors.

Math blogs vs CS blogs

March 31, 2008

I am very impressed by many professional mathematicians posting expository articles on their blog. One of the great advantages it has over just a survey article is that the comments section allows direct interaction with the author. Also, in many cases the presentation is more relaxed than a research paper and more fast paced than a book. I have examples of at least three blogs : Terry Tao’s blog , Rigorous Trivialities and The Unapologetic Mathematician , with regular detailed postings on various topics. I am wondering why there arent so many CS theory blogs? Luca’s blog is a great one that stands out in this regard. There has a been a lot of discussion about selling CS theory, see for example this post by Lance Fortnow. I think that one of the ways to do this is to write good expository articles and popularize various ideas that arise again and again in theoretical CS.

Party Snaps

March 19, 2008

Some photographs from a party on March 8th. Unfortunately I do not have some snaps of the delicious food cooked by Neha. However I am sure Amit has taken some snaps and I will ask him to send me a link to those.

Update 03/20/2008 : I have uploaded some more photos to the same folder on flickr. These were taken by Amit.

Finding Percentiles

March 19, 2008

Suppose you are given a multiset of real numbers A=\{a_1,a_2,\cdots,a_n\}. Given a number b, one can ask the question : Find the fraction of points in A that are less than or equal to b. Here is a simple randomized algorithm to solve this.

1. Choose K elements of A uniformly at random. Sort the elements. Let B be the sorted multiset.
2. Find out the fraction g of points in the new multiset B that are less than or equal b.
3. Output g

How good is this algorithm? Here is an attempt at an analysis. For a number x denote by F(x,A) the fraction of elements in A less than or equal to x. Suppose F(b,A) = f. Suppose F(b,B) = g. Then g|B| = K of the elements were chosen from f|A| elements and (1-g)|B| = (|B| - K) of the elements were chosen from (1-f)|A|. Hence the probability is \binom{|B|}{K} \cdot K^f \cdot (|B|-K)^(1-f). Therefore K has a binomial distribution B(|B|,f). The expected value of K is f|B| and the standard deviation is \sqrt{|B|f(1-f)}. By the The probability that we become wrong by \delta is Pr[|g - f| > \delta f]. By using a modified form of the Hoeffding’s bound the probability is upper bounded by 2e^{-\frac{\delta^2 |B| f}{2} }. If f > 0 then how large should |B| be so that we are at most off by 1% of f with probability 0.99? A simple calculation shows that if |B| > 1060/f then we are good. If The nice part about use of this bound is that it does not depend on the size of the underlying multiset A. However there is a dependance on f that is unknown in the first place! However if we assume f \geq 0.1 then we can get away with sampling a few thousand elements. Are there any better known algorithms? In randomized algorithms that produce 0-1 answers it is possible to increase the probability of success by repeating algorithm multiple times and then taking the majority of the output as the true answer. Is there a similar way of improving the overall accuracy of algorithms that produce one of many possible values such as the percentile algorithm?

Superconcentrators From Expanders

March 7, 2008

Today I will be blogging about the construction of superconcentrators from expanders. This is directly from the expanders survey. I am just reviewing it here for my own learning. First some definitions.
Definition 1. A d- regular bipartitie graph (L,R,E), with |L|=n and |R| =m is called a \emph{magical graph} if it saisfies the properties below. For a given set S of vertices, we denote the set of vertices to which some vertex in S is connected as \Gamma(S).

1. For every S with |S| \leq \frac{n}{10}, |\Gamma(S)| \geq \frac{5d}{8} |S|
2. For every S with \frac{n}{10} < |S| \leq \frac{n}{2}, |\Gamma(S)| \geq |S|.

Definition 2. A superconcentrator is a graph G=(V,E) with two given subsets I, O \subseteq S with |I|=|O|=n, such that for every S \subseteq I and T \subseteq O with |S|=|T|=k, the number of disjoint paths from S to T is at least k.

Superconcentrators with O(n) edges are interesting for various reasons which we do not go into here.

But we do give a construction of a superconcentrator with O(n) edges from magical graphs above. A simple probabilistic argument can show the following result.

Theorem There exists a constant n_0 such that for every d \geq 32, such that n \geq m,n_0 \geq \frac{3n}{4}, there is a magical graph with |L|=n,|R|=m.

Here is the construction of a superconcentrator from magical graphs. Assume that we can construct a superconcentrator with O(m) edges for every m \leq (n-1). The construction is recursive. First take two copies of the magical graphs (L_1,R_1,E_1), (L_2,R_2,E_2) with |L_1|=|L_2|=n and |R_1|=|R_2|=\frac{3n}{4}. Connect every vertex of L_1 to the corresponding vertex of L_2 and add edges between R_1=I and R_2=O so that the graph becomes a superconcentrator with the size of the input vertex set as \frac{3n}{4}. We claim that the resulting graph is a superconcentrator with input vertex set of size n.

Identify the input vertex set as L_1 and the output vertex set as L_2. For every | S \subseteq L_1 | = k \leq \frac{n}{2} it is true that | \Gamma(S) \cap R_1 | \geq |S|. Therefore by Halls Marriage Theorem there exists a perfect matching between vertices of S and \Gamma(S) \cap R_1. Similarly there exists a perfect matching between vertices of T \subseteq L_2 and \Gamma(T) \cap R_2. Together with the edges of the superconcentrator between input set R_1 and output set R_2 there are at least k disjoint paths between S and T. It remains to handle the case of k > \frac{n}{2}. For |S|=|T|=k > \frac{n}{2}, there is a subset U \subseteq S of vertices with |U| \geq (k - \frac{n}{2}) such that the vertices corresponding to U in L_2 are in T. Edges between such vertices contribute to (k- \frac{n}{2}) disjoint paths between S and T. The remaining \frac{n}{2} disjoint paths exist as proved earlier. Hence we have a superconcentrator with input, output vertex sets of size n. How many edges are there in this graph? For the base case of this recursion, we let a superconcentrator with input output sets of size n \leq n_0 be the complete bipartite graph with n_0 ^2 edges. The following recursion counts the edges. Let e(k) denote the number of edges in the superconcentrator as per this construction with input, output sets of size k. Then

e(k) = k^2 for k \leq n_0 and e(k) = k + 2dk + e(\frac{3k}{4}) for k > n_0. It can be easily seen that e(n) \leq Cn for C \geq max(n_0,4(2d+1)).

Computing Minimum Spanning Trees With Uncertainty

February 28, 2008

In this post we will partially discuss this paper by Erlebach et. al. The classical MST problem is studied for graphs with edge and vertex uncertainties. In general a problem with uncertainties can be specified as a triple (C,A,\phi) where C, the \emph{configuration} for the problem consists of a set of data points {c_1,c_2,\cdots,c_n} and A={A_1,A_2,\cdots,A_n} is a set of \emph{areas} where c_i \in A_i. The function \phi defines the solution for a particular configuration C. The input of the problem is just the set of areas A. An algorithm can \emph{update} any of the A_i and this determines the value of the data point c_i. The objective is to minimize the number of such updates and still be able to compute \phi(C). An algorithm is k update competitive if it can compute \phi(C) with at most k\cdot OPT+c updates where OPT is the minimum number of updates required by any algorithm.

Example: Given a weighted graph G=(V,E,w) the areas A_e, e \in E can be open intervals such that the edge weight w(e) \in A_e. The function \phi(C) can then be a minimum spanning tree of G.

The \emph{Witness} algorithm in this paper gives a general strategy to solve problems with uncertainties. The algorithm is k update competitive where k is an upper bound on the size of a witness set W. A witness set W is a set of areas such that in order to verify the solution \phi(C) at least one element of W needs to be updated. The witness algorithm is a kind of template for problems with uncertainties as it does not specify how a witness set is to be computed.

Question : In general is it always possible to compute a witness set in polynomial time?

The problem discussed in this paper is an extension of the concept of MST to graphs with uncertain edge weights. A MST of such a graph is said to be a MST in all possible realizations of edge weights, that satisfy the basic condition of the edge weight belonging to its assigned area. The paper provides an algorithm that is 2 update competitive when the areas are open intervals or singletons. Furthermore it is proved that, there is no (2-\epsilon) update competitive algorithm in this case. When there are no restrictions on the areas at all, it is proved that there is no constant update algorithm for the MST problem. The paper also considers graphs with vertex position uncertainties [where vertices are points in an Euclidean space].

The algorithm looks a lot like Kruskals MST algorithm. The main crux it to find witness sets of size 2 as edges are considered in their special sorted order. For details please refer to the paper.

Some comments and questions:

1) How about approximation algorithms for problems with uncertainties? The goal of an approximation algorithm would be to compute an approximate solution to problems with uncertainties with minimum updates.

2) In this model, there seems to be no restriction on the running time of the algorithm but the only importance seems to be given to the number of updates. Is there a model which combines the two meaningfully?

3) The notion of a MST that applies to all possible realizations of a graph with edge uncertainties seems too restrictive.