Thursday, March 06, 2008

Compressed Sensing: Finding needles in noisy haystacks, faster Sparse Measurements Matrix code, and Treelets as a way to build bases on datasets

Rui Castro, Waheed Bajwa, Jarvis Haupt, Robert Nowak, and Gil Raz present Finding needles in noisy haystacks.
The abstract reads:
The theory of compressed sensing shows that samples in the form of random projections are optimal for recovering sparse signals in high-dimensional spaces (i.e., finding needles in haystacks), provided the measurements are noiseless. However, noise is almost always present in applications, and compressed sensing suffers from it. The signal to noise ratio per dimension using random projections is very poor, since sensing energy is equally distributed over all dimensions. Consequently, the ability of compressed sensing to locate sparse components degrades significantly as noise increases. It is possible, in principle, to improve performance by “shaping” the projections to focus sensing energy in proper dimensions. The main question addressed here is, can projections be adaptively shaped to achieve this focusing effect? The answer is yes, and we demonstrate a simple, computationally efficient procedure that does so.
I very much like their crystal clear/visual explanation:
Incoherence between the projection vectors and the signal basis vectors is essential to compressed sensing, and is required for successful recovery from a small number of non-adaptive samples. The incoherence condition guarantees that one “spreads” the sensing energy over all the dimensions of the coordinate system of the basis. In essence, each compressive sample deposits an equal fraction of sensing energy in every dimension, making it possible to locate the sparse components without sensing directly in each and every dimension, which would require a number of samples equal to the length of the signal.
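
To make the energy-spreading argument concrete, here is a minimal Python sketch (a toy illustration of my own, not the authors' procedure): non-adaptive random projections with unit-norm rows deposit only a 1/n fraction of the sensing energy in each coordinate, while a naive two-stage scheme can refocus the energy on the coordinates that looked active after a first pass.

import numpy as np

# Toy comparison: non-adaptive random projections vs. a naive two-stage
# adaptive scheme that refocuses sensing energy (not the authors' algorithm).
rng = np.random.default_rng(0)
n, k, sigma, m = 1000, 5, 1.0, 200   # dimension, sparsity, noise level, samples
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = 5.0   # the sparse "needles"

def sense(x, A, sigma):
    # Noisy projections y = A x + w.
    return A @ x + sigma * rng.standard_normal(A.shape[0])

# Stage 1: incoherent rows spread unit energy over all n coordinates,
# so each coordinate receives only ~1/n of the energy per sample.
A1 = rng.standard_normal((m, n)) / np.sqrt(n)
y1 = sense(x, A1, sigma)
x1 = A1.T @ y1                        # crude back-projection estimate

# Stage 2: keep the coordinates with the largest first-pass estimates
# and concentrate the sensing energy on that much smaller set.
keep = np.argsort(np.abs(x1))[-5 * k:]
A2 = np.zeros((m, n))
A2[:, keep] = rng.standard_normal((m, keep.size)) / np.sqrt(keep.size)
y2 = sense(x, A2, sigma)
x2 = A2.T @ y2

print("energy per coordinate, stage 1:", 1 / n)          # 0.001
print("energy per coordinate, stage 2:", 1 / keep.size)  # 0.04
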
I just added a link to the article in the adaptive sampling techniques section of the Compressed Sensing Big Picture.

Raghunandan Kainkaryam seems to have implemented a faster version of the Sparse Measurements Matrix code (initially implemented as a Monday Morning Algorithm and later included in the CS Code section; the algorithm is due to Radu Berinde and Piotr Indyk) in his blog entry entitled Shifted Transversal Designs are Expander Graphs. I have not checked it yet.
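
For reference, the measurement matrix in the Berinde-Indyk construction is just the adjacency matrix of a d-left-regular bipartite graph, which is a good expander with high probability when generated at random. Here is a minimal Python sketch of that random construction (my own version, not Raghunandan's code nor the shifted transversal design construction discussed in his post):

import numpy as np

def sparse_measurement_matrix(m, n, d, rng):
    # Each of the n columns gets exactly d ones placed in d distinct rows:
    # the adjacency matrix of a d-left-regular bipartite graph, which is
    # an expander with high probability for suitable m, n, d.
    A = np.zeros((m, n), dtype=np.int8)
    for j in range(n):
        A[rng.choice(m, size=d, replace=False), j] = 1
    return A

rng = np.random.default_rng(1)
A = sparse_measurement_matrix(m=50, n=200, d=8, rng=rng)
assert (A.sum(axis=0) == 8).all()   # d ones per column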

Two other articles grabbed my attention on the internets:
Rank related properties, for Basis Pursuit and total variation regularization by Francois Malgouyres. The abstract reads:
This paper focuses on optimization problems containing an $l^1$ kind of regularity criterion and a smooth data fidelity term. A general theorem is adapted to this context. It gives an estimate of the distribution law of the "rank" of the solution to the optimization problems, when the initial datum follows a uniform (in a convex compact set) distribution law. It says that, asymptotically, solutions with a large rank are more and more likely. The main goal of this paper is to understand the meaning of this notion of rank for some energies which are commonly used in image processing. We study in detail the energy whose level sets are defined as the convex hull of a finite subset of $\mathbb{R}^N$ (think Basis Pursuit), and the total variation. For these energies, the notion of rank relates respectively to sparse representation and staircasing.
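
For orientation, the standard forms of the two energies the abstract alludes to are, in my notation (the paper's exact formulation may differ):

$$\min_{u \in \mathbb{R}^N} \; \frac{1}{2}\|Au - f\|_2^2 + \lambda \|u\|_1 \qquad \text{(Basis Pursuit-type energy)}$$

$$\min_{u \in \mathbb{R}^N} \; \frac{1}{2}\|Au - f\|_2^2 + \lambda\,\mathrm{TV}(u), \quad \mathrm{TV}(u) = \|\nabla u\|_1 \qquad \text{(total variation)}$$

As I read the abstract, the "rank" of a solution then counts the number of active atoms (sparsity) in the first case and relates to the number of jumps (staircasing) in the second.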


It seems interesting, but I am not sure how it directly or indirectly impacts Compressed Sensing. Some of his more recent submissions also seem relevant to CS, with the same caveat.
The second article that grabbed my interest is Treelets - An Adaptive Multi-Scale Basis for Sparse Unordered Data by Ann Lee, Boaz Nadler and Larry Wasserman. The abstract reads:
In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered, with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity: the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this paper, we present treelets, a novel construction of multi-scale bases that extends wavelets to non-smooth signals. The method is fully adaptive, as it returns a hierarchical tree and an orthonormal basis which both reflect the internal structure of the data. Treelets are especially well-suited as a dimensionality reduction and feature selection tool prior to regression and classification, in situations where sample sizes are small and the data are sparse with unknown groupings of correlated or collinear variables. The method is also simple to implement and analyze theoretically. Here we describe a variety of situations where treelets perform better than principal component analysis, as well as some common variable selection and cluster averaging schemes. We illustrate treelets on a blocked covariance model and on several data sets (hyperspectral image data, DNA microarray data, and internet advertisements) with highly complex dependencies between variables.
I can't wait to see the code that implements this algorithm. To better understand why this is relevant to previous entries, the figure above reproduces Figure 6 of the paper, whose caption reads:

Fig. 6 shows an example of how treelets can learn the covariance structure for colon cell nuclei (Tissue type 3). The method learns both the tree structure and a basis through a series of Jacobi rotations (see top right panel). By construction, the basis vectors are localized and supported on nested clusters in the tree (see the bottom left and top left panels). As a comparison, we have also computed the PCA eigenvectors. The latter vectors are global and involve all the original variables (see bottom right panel).
In other words, this is another technique building an orthonormal basis for a specific dataset.
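
Since the construction is conceptually simple, here is a rough Python sketch of a single treelet step as I understand it from the caption (my own reading, not the authors' code): find the pair of variables with the highest correlation and apply the Jacobi (Givens) rotation that zeroes their covariance; iterating this, and retiring the "difference" variable at each step, yields the tree and the orthonormal basis.

import numpy as np

def treelet_step(C):
    # One step of the treelet idea: locate the most correlated pair of
    # variables and rotate them so their covariance becomes zero.
    d = np.sqrt(np.diag(C))
    corr = C / np.outer(d, d)
    np.fill_diagonal(corr, 0.0)
    i, j = np.unravel_index(np.argmax(np.abs(corr)), corr.shape)
    # Jacobi rotation angle that annihilates C[i, j].
    theta = 0.5 * np.arctan2(2 * C[i, j], C[j, j] - C[i, i])
    R = np.eye(C.shape[0])
    R[i, i] = R[j, j] = np.cos(theta)
    R[i, j], R[j, i] = -np.sin(theta), np.sin(theta)
    return R @ C @ R.T, R, (i, j)

# Toy data with one strongly correlated pair of variables.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 6))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(500)
C2, R, pair = treelet_step(np.cov(X, rowvar=False))
print("merged pair:", pair, "covariance after rotation:", C2[pair])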

Finally, I am going to add the Talks on Compressed Sensing at the Institut Poincaré in Paris, as pointed out by Gabriel Peyré. Some of them will be in English.
