Wednesday, May 12, 2010

CS: In Compressive Sensing: P = NP + Additional Constraints


Yesterday's entry showed 0 comments when in fact there was one. So let me feature it here; anonymous tells us:

I think it is worth pointing out the difference between claims such as those made in this paper and those of, say, Tanner and Donoho.

We should not forget what CS is all about. Sparse recovery is NP-hard. However, under certain additional conditions, we can solve the problem with efficient methods (such as l1 minimization).

There are then two questions:
  1. When can we solve the problem at all (even though we might have to use combinatorial algorithms)?
  2. Under which conditions can we use faster methods?

The answer to 1 has been well known for years. The 2-times-sparsity result is a worst-case limit, and it has been pointed out before (see, for example, Fletcher, Rangan, and Goyal) that, if we have randomness in the sparse signal, then we only need sparsity + 1 observations (with probability 1). This is the result rediscovered by Atul here.

The question is then how random MP can solve the NP-hard problem. Well, it can, but there is no guarantee it can do this in polynomial time. In fact, as the algorithm has a non-zero probability of selecting any of the subsets, there is a possibility of selecting the correct one. Unfortunately, this probability might be tiny and, as far as I can see, the algorithm might still require exponentially many iterations before it finds the optimum.

I agree with anonymous. However, I am curious as to how the maximum number of iterations of the Probabilistic MP algorithm influences the Donoho-Tanner phase transition, as that transition reflects not only some type of condition on the measurement matrix but also the solver's capabilities. The phase transition does not care about the time it took to compute the solution.
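
To make that last point concrete, here is a minimal Python sketch (not taken from either paper) of how such a phase transition is typically mapped out empirically: sweep the undersampling ratio delta = m/n and the sparsity ratio rho = k/m, draw Gaussian measurement matrices and planted sparse signals, and record how often the solver returns the planted solution. l1 minimization, cast as a linear program, stands in as the solver below; the grid, tolerance, and function names are illustrative choices, and swapping in randMP instead would show how its iteration budget moves the empirical transition.

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(A, y):
    """Basis pursuit, min ||x||_1 s.t. Ax = y, as a linear program with x = u - v, u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n), method="highs")
    return res.x[:n] - res.x[n:]

def empirical_transition(n=100, deltas=np.linspace(0.1, 0.9, 9),
                         rhos=np.linspace(0.1, 0.9, 9), trials=20, tol=1e-4):
    """Empirical success fraction of l1 recovery over a (delta, rho) grid,
    with delta = m/n (undersampling) and rho = k/m (sparsity)."""
    success = np.zeros((len(rhos), len(deltas)))
    for j, delta in enumerate(deltas):
        m = max(1, int(round(delta * n)))
        for i, rho in enumerate(rhos):
            k = max(1, int(round(rho * m)))
            for _ in range(trials):
                A = np.random.randn(m, n) / np.sqrt(m)
                x0 = np.zeros(n)
                x0[np.random.choice(n, k, replace=False)] = np.random.randn(k)
                x_hat = l1_recover(A, A @ x0)
                success[i, j] += np.linalg.norm(x_hat - x0) < tol * np.linalg.norm(x0)
    return success / trials
```

Plotting the resulting success fractions over the (delta, rho) grid gives the familiar transition curve; since only the final answer is checked, the picture indeed says nothing about how long each solve took.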

On the topic of NP-hardness, in Sparse Recovery using Smoothed $\ell^0$ (SL0): Convergence Analysis by Hosein Mohimani, Massoud Babaie-Zadeh, Irina Gorodnitsky, and Christian Jutten, one can read the following about the SL0 algorithm:

...Solving (1) using a combinatorial search is NP-hard. Many alternative algorithms have been proposed to solve this problem. Two frequently used approaches are Matching Pursuit (MP) [11] and Basis Pursuit (BP) [8], which have many variants. MP is a fast algorithm but it cannot be guaranteed to find the sparsest solution. BP is based on replacing l_0 with the l_1 norm which can be minimized using Linear Programming techniques. BP is computationally more complex than MP, but it can find the sparsest solution with high probability, provided this solution is sufficiently sparse [13], [14], [2], [15] ....

Since (1) is NP-hard, one may wonder that proving convergence of SL0 (with a complexity growing in quadratic with scale) means that NP = P. This is not the case. Note that in BP, too, the guarantee that BP will find the solution of (1) does not mean that NP = P, because such a guarantee only exists in the case of a very sparse solution. Our analysis possesses a similar limitation, too.

In short, if one were to find a polynomial-time method for the general "unconstrained" sparse recovery problem, it would be equivalent to proving P = NP. The SL0 solver can guarantee that it solves the problem in polynomial time only because of the additional constraint that the asymmetric Restricted Isometry Property is fulfilled by the measurement matrix; as it turns out, this property is established thanks to Random Matrix Theory. Eventually, the properties of SL0 are shown to be equivalent to those of an MP algorithm. In the case of the ProMP algorithm, a condition on the variance of the attendant Gram matrix of the measurement matrix has to be fulfilled.
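
For concreteness, here is a minimal Python sketch of the smoothed-l0 idea described in the quoted passage: the l0 count is replaced by a sum of Gaussian bumps whose width sigma shrinks over the outer loop, and each gradient step is projected back onto the feasible set {s : As = x}. The step size, the sigma schedule, and the iteration counts below are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np

def sl0(A, x, sigma_min=1e-3, sigma_decrease=0.5, mu=2.0, inner_iters=3):
    """Sketch of smoothed-l0 recovery: approximately maximize sum(exp(-s_i^2 / (2 sigma^2)))
    subject to A s = x, for a decreasing sequence of sigma."""
    A_pinv = np.linalg.pinv(A)
    s = A_pinv @ x                        # minimum l2-norm feasible point as initialization
    sigma = 2.0 * np.max(np.abs(s))
    while sigma > sigma_min:
        for _ in range(inner_iters):
            # gradient step on the smoothed objective (step size absorbed into mu)
            delta = s * np.exp(-s**2 / (2.0 * sigma**2))
            s = s - mu * delta
            # project back onto the affine feasible set {s : A s = x}
            s = s - A_pinv @ (A @ s - x)
        sigma *= sigma_decrease
    return s
```

Each outer iteration costs only a few matrix-vector products, so the run time grows polynomially with the problem size; the point made above is that this does not contradict NP-hardness, because the proof that the iterates actually land on the sparsest solution only holds under the extra restricted-isometry-type condition on the measurement matrix.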


Image Credit: NASA/JPL/Space Science Institute

1 comment:

Anonymous said...

Dear Igor and Atul,

Igor, you are correct in that a subset of sparse problems is not NP-hard; this is what CS has been showing over the last few years. With regard to randMP, Atul states in the post to you that 'It allows exact recovery when the number of nonzero coefficients S is upto M-1 for a M*N matrix with M less than N.' There is no mention of an extra condition, so the statement means that, in the worst case, randMP has combinatorial complexity (unless NP = P). But this is not a selling point of the method, as the same can be said for exhaustive search.

Things become more interesting if we can nail down more concrete statements. For example, there could be a statement of the form: if the measurement matrix has f(K) rows and N > f(K) columns and the matrix is drawn i.i.d. from some distribution, then randMP recovers K-sparse signals in fewer than f(N, M, K) iterations with probability 1 - epsilon (where the f's are different functions). But I can't find any such statement in the paper.

I can't even find the condition on 'the variance of the Gram matrix' alluded to in Atul's statement. It would indeed be of interest to have such a statement; however, if I understand Atul correctly, he is talking about a coherence-type property here. The question is then how much better the statement is than, say, the standard MP result based on coherence.
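
For reference, the standard coherence-based guarantee mentioned in that last sentence states that (orthogonal) matching pursuit recovers every k-sparse signal from y = Ax whenever k < (1 + 1/mu(A))/2, where mu(A) is the mutual coherence of A, i.e. the largest absolute inner product between distinct normalized columns. Here is a minimal Python sketch of that check; the matrix dimensions in the example are an arbitrary choice for illustration.

```python
import numpy as np

def mutual_coherence(A):
    """Largest absolute inner product between distinct normalized columns of A."""
    G = A / np.linalg.norm(A, axis=0)   # normalize each column
    gram = np.abs(G.T @ G)
    np.fill_diagonal(gram, 0.0)
    return gram.max()

def coherence_sparsity_bound(A):
    """Classical coherence bound: MP/OMP recovers any k-sparse x from Ax
    whenever k < (1 + 1/mu(A)) / 2."""
    return 0.5 * (1.0 + 1.0 / mutual_coherence(A))

# Example: a 50x200 Gaussian matrix typically yields a bound of only a couple of
# nonzeros, far below the M-1 level quoted for randMP.
A = np.random.randn(50, 200)
print(mutual_coherence(A), coherence_sparsity_bound(A))
```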
