Thursday, April 03, 2008

Compressive Sampling: Sparse Sampling of Signal Innovations

A good introduction to the subject of sub-Nyquist sampling of signals with a finite rate of innovation was written by Thierry Blu, Pier-Luigi Dragotti, Martin Vetterli, Pina Marziliano, and Lionel Coulot in Sparse Sampling of Signal Innovations. The introduction reads:

Signal acquisition and reconstruction is at the heart of signal processing, and sampling theorems provide the bridge between the continuous and the discrete-time worlds. The most celebrated and widely used sampling theorem is often attributed to Shannon (and many others, from Whittaker to Kotel′nikov and Nyquist, to name a few) and gives a sufficient condition, namely bandlimitedness, for an exact sampling and interpolation formula. The sampling rate, at twice the maximum frequency present in the signal, is usually called the Nyquist rate. Bandlimitedness, however, is not necessary as is well known but only rarely taken advantage of [1]. In this broader, nonbandlimited view, the question is: when can we acquire a signal using a sampling kernel followed by uniform sampling and perfectly reconstruct it?

The Shannon case is a particular example, where any signal from the subspace of bandlimited signals, denoted by BL, can be acquired through sampling and perfectly interpolated from the samples. Using the sinc kernel, or ideal low-pass filter, nonbandlimited signals will be projected onto the subspace BL. The question is: can we beat Shannon at this game, namely, acquire signals from outside of BL and still perfectly reconstruct? An obvious case is bandpass sampling and variations thereof. Less obvious are sampling schemes taking advantage of some sort of sparsity in the signal, and this is the central theme of this article. That is, instead of generic bandlimited signals, we consider the sampling of classes of nonbandlimited parametric signals. This allows us to circumvent Nyquist and perfectly sample and reconstruct signals using sparse sampling, at a rate characterized by how sparse they are per unit of time. In some sense, we sample at the rate of innovation of the signal by complying with Occam's razor principle [known as Lex Parcimoniæ or Law of Parsimony: Entia non sunt multiplicanda præter necessitatem, or, “Entities should not be multiplied beyond necessity” (from Wikipedia)].

Besides Shannon's sampling theorem, a second basic result that permeates signal processing is certainly Heisenberg's uncertainty principle, which suggests that a singular event in the frequency domain will necessarily be widely spread in the time domain. A superficial interpretation might lead one to believe that a perfect frequency localization requires a very long time observation. That this is not necessary is demonstrated by high resolution spectral analysis methods, which achieve very precise frequency localization using finite observation windows [2], [3]. The way around Heisenberg resides in a parametric approach, where the prior that the signal is a linear combination of sinusoids is put to use.

If by now you feel uneasy about slaloming around Nyquist, Shannon, and Heisenberg, do not worry. Estimation of sparse data is a classic problem in signal processing and communications, from estimating sinusoids in noise, to locating errors in digital transmissions. Thus, there is a wide variety of available techniques and algorithms. Also, the best possible performance is given by the Cramér-Rao lower bounds for this parametric estimation problem, and one can thus check how close to optimal a solution actually is.

We are thus ready to pose the basic questions of this article. Assume a sparse signal (be it in continuous or discrete time) observed through a sampling device that is a smoothing kernel followed by regular or uniform sampling. What is the minimum sampling rate (as opposed to Nyquist's rate, which is often infinite in cases of interest) that allows us to recover the signal? What classes of sparse signals are possible? What are good observation kernels, and what are efficient and stable recovery algorithms? How does observation noise influence recovery, and what algorithms will approach optimal performance? How will these new techniques impact practical applications, from inverse problems to wideband communications? And finally, what is the relationship between the presented methods and classic methods as well as the recent advances in compressed sensing and sampling?
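The quoted passage leans on two ideas: a signal with K innovations per period carries only 2K free parameters (its rate of innovation), and parametric estimation can pin those parameters down from very few samples. As a concrete illustration of the second point, here is a minimal sketch of my own (the classical Prony/annihilating-filter construction, not code taken from the paper): the frequencies of K complex exponentials can be recovered exactly from as few as 2K noiseless samples, regardless of how close the frequencies are.

```python
import numpy as np

def prony_frequencies(x, K):
    """Recover K angular frequencies from a noiseless window
    x[n] = sum_k c_k * exp(1j * w_k * n) using a length-(K+1)
    annihilating (Prony) filter; needs len(x) >= 2*K."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    # Annihilation equations: for n = K..N-1,
    #   x[n] + sum_{k=1}^{K} h[k] * x[n-k] = 0
    A = np.array([[x[n - k] for k in range(1, K + 1)] for n in range(K, N)])
    h = np.linalg.lstsq(A, -x[K:N], rcond=None)[0]
    # Roots of z^K + h[1] z^{K-1} + ... + h[K] are exp(1j * w_k)
    roots = np.roots(np.concatenate(([1.0], h)))
    return np.sort(np.angle(roots))

# Two frequencies recovered exactly from just 2K = 4 samples
w_true = [0.3, 1.1]
n = np.arange(4)
x = np.exp(1j * w_true[0] * n) + 0.7 * np.exp(1j * w_true[1] * n)
print(prony_frequencies(x, K=2))   # ~ [0.3, 1.1]
```

The same annihilating-filter step reappears in the finite-rate-of-innovation recovery discussed in the excerpt below.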

In the last paragraph, they address how their approach relates to Compressed Sensing. An excerpt of that discussion reads:

[...] In the absence of noise, it has been shown that this minimization provides the parameters of the innovation, with “overwhelming” probability [19] using O(K log N) measurements. Yet this method is not as direct as the annihilating filter method that does not require any iteration. Moreover, the compressed-sensing approach does not reach the critical sampling rate, unlike the method proposed here which almost achieves this goal (2K + 1 samples for 2K innovations). On the other hand, compressed sensing is not limited to uniform measurements of the form (14) and could potentially accommodate arbitrary sampling kernels—and not only the few ones that satisfy an annihilation property. This flexibility is certainly an attractive feature of compressed sensing [...]
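To see where the "2K + 1 samples for 2K innovations" figure comes from, here is a minimal sketch under standard finite-rate-of-innovation assumptions (again my own illustration; the paper's actual sampling kernels and its equation (14) are not reproduced here). The Fourier-series coefficients of a τ-periodic stream of K Diracs form a sum of K exponentials in the frequency index, so the same annihilating-filter step as in the sketch above yields the locations, and a small linear system then yields the amplitudes, from exactly 2K + 1 coefficients.

```python
import numpy as np

def recover_diracs(x_hat, K, tau):
    """Recover locations/amplitudes of a tau-periodic stream of K Diracs
    x(t) = sum_k a_k * delta(t - t_k) from its Fourier coefficients
    x_hat[m] = (1/tau) * sum_k a_k * exp(-2j*pi*m*t_k/tau), m = 0..2K."""
    # 1) Annihilating filter: x_hat[m] is a sum of K exponentials in m,
    #    so a length-(K+1) filter h (with h[0] = 1) annihilates it.
    A = np.array([[x_hat[m - k] for k in range(1, K + 1)]
                  for m in range(K, 2 * K + 1)])
    h = np.linalg.lstsq(A, -x_hat[K:2 * K + 1], rcond=None)[0]
    u = np.roots(np.concatenate(([1.0], h)))          # u_k = exp(-2j*pi*t_k/tau)
    t = np.sort(np.mod(-np.angle(u) * tau / (2 * np.pi), tau))
    # 2) Amplitudes from a Vandermonde system in the recovered u_k
    V = np.exp(-2j * np.pi * np.outer(np.arange(2 * K + 1), t) / tau)
    a = np.linalg.lstsq(V, tau * x_hat, rcond=None)[0]
    return t, a.real

# K = 2 Diracs on a period tau = 1, seen through 2K + 1 = 5 Fourier coefficients
tau, t_true, a_true = 1.0, np.array([0.2, 0.55]), np.array([1.0, -0.5])
m = np.arange(5)
x_hat = (1 / tau) * (np.exp(-2j * np.pi * np.outer(m, t_true) / tau) @ a_true)
print(recover_diracs(x_hat, K=2, tau=tau))   # ~ ([0.2, 0.55], [1.0, -0.5])
```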
We talked before about how this type of result might explain some other results in the solving of the Linear Transport Equation, but it obviously has a much broader field of application. We have also seen this Cramér-Rao bound before.
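For reference, the Cramér-Rao bound invoked as the benchmark is the general statement that no unbiased estimator can beat the inverse Fisher information (the paper derives the specific bounds for its sampling model, which are not reproduced here):

```latex
% Cramer-Rao lower bound: for any unbiased estimator \hat{\theta} of \theta
% from data y with likelihood p(y;\theta),
\operatorname{cov}(\hat{\theta}) \;\succeq\; I(\theta)^{-1},
\qquad
I(\theta) \;=\; \mathbb{E}\!\left[
  \frac{\partial \log p(y;\theta)}{\partial \theta}\,
  \frac{\partial \log p(y;\theta)}{\partial \theta}^{\!\top}
\right]
% "\succeq" means the difference is positive semidefinite; an algorithm whose
% error covariance approaches I(\theta)^{-1} is (asymptotically) efficient.
```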
