Tuesday, May 11, 2010

CS proMP: a Probabilistic Matching Pursuit Code


In the longest entry of the week, I made a brief mention of Probabilistic Matching Pursuit for Compressive Sensing by Atul Divekar and Okan Ersoy. As a result, Atul Divekar sent me the following:

Hello Igor,

Thank you for mentioning my report on probabilistic matching pursuit. At the risk of sounding like a shameless plug for the paper, I would like you to highlight its significance: it allows exact recovery when the number of nonzero coefficients S is up to M-1 for an M x N matrix with M less than N. The performance does not depend on the RIP, but on a simpler measure: the variance of the Gram matrix.

We show recovery results for specific values of S,M,N which confirm this.

Also, the algorithm overcomes a fundamental weakness of previous matching pursuit algorithms such as OMP: on some problem instances, OMP is certain to fail. With the probabilistic approach, recovery will always succeed.

I would really appreciate it if you read the whole article and commented on it.
Thanks
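To get a rough feel for where the randomness enters before looking at the actual prototypes: plain OMP always picks the column most correlated with the current residual, so a bad instance can defeat that deterministic choice, whereas a probabilistic variant draws the next column at random, with better-correlated columns more likely, so no support set is ever ruled out. The small Python/NumPy sketch below is only an illustration of that selection step; the sampling rule here is my own stand-in, not the one implemented in proMP.m.

```python
import numpy as np

def pick_next_atom(A, r, rng, greedy=False):
    """Choose the next column of A to add, given the current residual r.

    greedy=True  -> deterministic OMP rule (argmax of the correlations)
    greedy=False -> probabilistic rule: sample a column with probability
                    proportional to its correlation magnitude (an assumed,
                    illustrative distribution, not the one in the paper).
    """
    corr = np.abs(A.T @ r)            # correlation of each column with the residual
    if greedy:
        return int(np.argmax(corr))   # OMP: always the same (possibly wrong) pick
    p = corr / corr.sum()             # randomized: every column keeps some chance
    return int(rng.choice(A.shape[1], p=p))

# Tiny usage example on a random instance:
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
r = rng.standard_normal(20)
print(pick_next_atom(A, r, rng), pick_next_atom(A, r, rng, greedy=True))
```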
It's not a shameless plug if something really good comes out of it, and it looks like this is the case. So I went ahead and asked Atul to provide prototype code for the algorithm so that others could try it out and be convinced. He kindly obliged by providing the prototypes on his page:

I modified the two files so that proMP (proMP.m) can be called as a stand-alone program and from pbp1 (pbp.m).
A couple of points: the values of h and epsilon are part of the algorithm's core parameters and should be left at their default values (unless someone wants to study the effect of changing them).

The least-squares step often does not return an exact solution, so even if all the bases have been found, the residue may not be exactly zero. We therefore compare it to a threshold whose value should be much smaller than the smallest nonzero coefficient. This usually works, but may fail if the threshold is too large (admitting a false solution) or too small (failing to detect a true solution).
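Here is a minimal, self-contained sketch in Python/NumPy of that check; the problem sizes, threshold value, and variable names are mine, not those used in proMP.m or in the report.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, S = 20, 50, 5          # illustrative sizes only

# Synthetic problem: y = A x with an S-sparse x whose nonzeros are at least 1.0.
A = rng.standard_normal((M, N))
x = np.zeros(N)
true_support = rng.choice(N, size=S, replace=False)
x[true_support] = rng.uniform(1.0, 2.0, size=S)
y = A @ x

def residual_norm(support):
    """Least-squares fit of y on the selected columns; return the residual norm."""
    coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    return np.linalg.norm(y - A[:, support] @ coeffs)

# The threshold must sit well below the smallest nonzero coefficient (>= 1.0 here):
# too large and a wrong support may slip through, too small and least-squares
# rounding error may cause the true support to be rejected.
threshold = 1e-6
print(residual_norm(true_support) < threshold)   # True: only rounding error remains
```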

In other words, just download the two files and run pbp or pbp1. You may want to change the value of T to get better statistics. Atul also added:

One thing to note: in theory, testing that the residue magnitude is zero should be enough to identify that the true bases have been found. But I find that the least squares sometimes leaves behind rounding errors even when it includes all the true bases. Also, I wanted to avoid counting any false solutions (i.e., residue equal to zero when not all the true bases are selected). So in the simulation I count the actual number of true bases found.

In the report, the last section gives a theoretical proof of why recovery is possible even in the case where 2S is greater than M.

Also, there are a couple of typos in the report which I noticed after it went online: the mean number of iterations should be sum 1/p_{j}, not sum p_{j}, and the expression for p_{j} should have 1-(S-j)/N, not (S-j+1).
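If I read the correction right, the sum of reciprocals comes from the usual geometric-trials argument: if the j-th correct basis is found with probability p_{j} on each draw, the expected number of draws spent at that stage is 1/p_{j}, so

$$\mathbb{E}[\text{iterations}] \;=\; \sum_{j=1}^{S} \frac{1}{p_j},$$

with p_{j} as given (after correction) in the report.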

Looking at the claims, I wonder how the Donoho-Tanner phase transition would change or shift as a result of using this solver. I am going to add it to the reconstruction code list of the Compressive Sensing Big Picture.

Thanks, Atul!

Credit: NASA, Test of the launch escape system for the upcoming Orion manned spacecraft on May 6, 2010.

1 comment:

Anonymous said...

I think it is worth pointing out the difference between claims such as those made in this paper and those of, say, Tanner and Donoho.

We should not forget what CS is all about. Sparse recovery is NP-hard. However, under certain additional conditions, we can solve the problem with efficient methods (such as l1 minimization).

There are then two questions: 1) when can we solve the problem at all (even though we might have to use combinatorial algorithms), and 2) under which conditions can we use faster methods?

The answer to 1) has been well known for years. The 2-times-sparsity result is a worst-case limit, and it has been pointed out before (see, for example, Fletcher, Rangan, and Goyal) that, if there is randomness in the sparse signal, then we only need sparsity + 1 observations (with probability 1). This is the result rediscovered by Atul here.

The question is then how random MP can solve the NP-hard problem. Well, it can, but there is no guarantee it can do so in polynomial time. In fact, since the algorithm has a non-zero probability of selecting any of the subsets, there is a possibility of selecting the correct one. Unfortunately, this probability might be tiny and, as far as I can see, the algorithm might still require exponentially many iterations before it finds the optimum.
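As a rough, hypothetical illustration of how small that probability can be if the draws were uniform over supports (the actual algorithm weights its draws by correlation, so this is only worst-case intuition, not an analysis of proMP):

```python
from math import comb

# Chance that one uniformly random S-subset of N columns is the true support.
# Purely illustrative worst case; a guided sampler does far better per draw.
for N, S in [(50, 5), (256, 16), (1024, 32)]:
    print(f"N={N}, S={S}: 1/C(N,S) = {1 / comb(N, S):.3e}")
```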
