## Friday, July 31, 2015

### The Spectral Norm of Random Inner-Product Kernel Matrices

When dealing with covariance matrices, one always wonder how thresholding coefficients will yield a matrix with similar properties. Today, we have some answers in that area:

We study the spectra of p×p random matrices K with off-diagonal (i,j) entry equal to n1/2k(XTiXj/n1/2), where Xi's are the rows of a p×n matrix with i.i.d. entries and k is a scalar function. It is known that under mild conditions, as n and p increase proportionally, the empirical spectral measure of K converges to a deterministic limit μ. We prove that if k is a polynomial and the distribution of entries of Xi is symmetric and satisfies a general moment bound, then K is the sum of two components, the first with spectral norm converging to μ (the maximum absolute value of the support of μ) and the second a perturbation of rank at most two. In certain cases, including when k is an odd polynomial function, the perturbation is 0 and the spectral norm K converges to μ. If the entries of Xi are Gaussian, we also prove that K converges to μ for a large class of odd non-polynomial functions k. In general, the perturbation may contribute spike eigenvalues to K outside of its limiting support, and we conjecture that they have deterministic limiting locations as predicted by a deformed GUE model. Our study of such matrices is motivated by the analysis of statistical thresholding procedures to estimate sparse covariance matrices from multivariate data, and our results imply an asymptotic approximation to the spectral norm error of such procedures when the population covariance is the identity.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### A Perspective on Future Research Directions in Information Theory

Here is the table of content:
1. What is Information Theory?
2. Communications
3. Networks and Networked Systems
4. Control Theory
5. Neuroscience
6. Signal Processing
7. Statistics and Machine Learning
8. Genomics and Molecular Biology
9. Theoretical Computer Science
10. Physics
11. Economics and Finance

A Perspective on Future Research Directions in Information Theory by Jeffrey G. Andrews, Alexandros Dimakis, Lara Dolecek, Michelle Effros, Muriel Medard, Olgica Milenkovic, Andrea Montanari, Sriram Vishwanath, Edmund Yeh, Randall Berry, Ken Duffy, Soheil Feizi, Saul Kato, Manolis Kellis, Stuart Licht, Jon Sorenson, Lav Varshney, Haris Vikalo

Information theory is rapidly approaching its 70th birthday. What are promising future directions for research in information theory? Where will information theory be having the most impact in 10-20 years? What new and emerging areas are ripe for the most impact, of the sort that information theory has had on the telecommunications industry over the last 60 years? How should the IEEE Information Theory Society promote high-risk new research directions and broaden the reach of information theory, while continuing to be true to its ideals and insisting on the intellectual rigor that makes its breakthroughs so powerful? These are some of the questions that an ad hoc committee (composed of the present authors) explored over the past two years. We have discussed and debated these questions, and solicited detailed inputs from experts in fields including genomics, biology, economics, and neuroscience. This report is the result of these discussions.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Thursday, July 30, 2015

### Dimensionality Reduction for k-Means Clustering and Low Rank Approximation

Random Projections as rank-k projection-cost preserving sketches (Th12) ! From the paper:
We start by noting that both problems are special cases of a general constrained k-rank approximation problem [DFK+04], which also includes problems related to sparse and nonnegative PCA[ PDK13, YZ13, APD14]. Then, following the coreset definitions of [FSS13], we introduce the concept of a projection-cost preserving sketch, an approximation where the sum of squared distances of A’s columns from any k-dimensional subspace (plus a fixed constant independent of the subspace) is multiplicatively close to that of A. This ensures that the cost of any k-rank projection of A is well approximated by A and thus, we can solve the general constrained k-rank approximation problem approximately for A using A.Next, we give several simple and efficient approaches for obtaining projection-cost preserving sketches with (1 + ǫ) relative error. All of these techniques simply require computing an SVD, multiplying by a random projection, random sampling, or some combination of the three.

Dimensionality Reduction for k-Means Clustering and Low Rank Approximation by Michael B. Cohen, Sam Elder, Cameron Musco, Christopher Musco, Madalina Persu

We show how to approximate a data matrix A with a much smaller sketch A~ that can be used to solve a general class of constrained k-rank approximation problems to within (1+ϵ) error. Importantly, this class of problems includes k-means clustering and unconstrained low rank approximation (i.e. principal component analysis). By reducing data points to just O(k) dimensions, our methods generically accelerate any exact, approximate, or heuristic algorithm for these ubiquitous problems.
For k-means dimensionality reduction, we provide (1+ϵ) relative error results for many common sketching techniques, including random row projection, column selection, and approximate SVD. For approximate principal component analysis, we give a simple alternative to known algorithms that has applications in the streaming setting. Additionally, we extend recent work on column-based matrix reconstruction, giving column subsets that not only `cover' a good subspace for $\bv{A}$, but can be used directly to compute this subspace.
Finally, for k-means clustering, we show how to achieve a (9+ϵ) approximation by Johnson-Lindenstrauss projecting data points to just O(logk/ϵ2) dimensions. This gives the first result that leverages the specific structure of k-means to achieve dimension independent of input size and sublinear in k.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Wednesday, July 29, 2015

### Training Very Deep Networks - implementation -

As each layer of a deep neural  network can be viewed as an iteration step of a reconstruction solver, one wonders if and how to design them so that they more generally fit the generic behavior of traditional solvers (which use many iterations). In turn this may provide some insight on how to design reconstruction solvers (i.e. allow some information from far away iteration to come back into the loop). Here is the beginning of an answer today in the following preprint:

Training Very Deep Networks by Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber

Theoretical and empirical evidence indicates that the depth of neural networks is crucial for their success. However, training becomes more difficult as depth increases, and training of very deep networks remains an open problem. Here we introduce a new architecture designed to overcome this. Our so-called highway networks allow unimpeded information flow across many layers on information highways. They are inspired by Long Short-Term Memory recurrent networks and use adaptive gating units to regulate the information flow. Even with hundreds of layers, highway networks can be trained directly through simple gradient descent. This enables the study of extremely deep and efficient architectures.
Some implementation of this algorithm can be found here at: http://people.idsia.ch/~rupesh/very_deep_learning/

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Tuesday, July 28, 2015

### Compressive Sensing for #IoT: Photoplethysmography-Based Heart Rate Monitoring in Physical Activities via Joint Sparse Spectrum Reconstruction, TROIKA

Zhilin just sent me the following:

Hi Igor,

I hope all is well with you!

​ In recent years, I have been working on signal processing for wearable health monitoring, such as signal processing of vital signs in smart watch and other wearables. Particularly, I've applied compressed sensing to this area, and achieved some successes on heart rate monitoring for fitness tracking and health monitoring. So I think you and your blog's readers may be interested in the following work of my collaborators and me:

Zhilin Zhang, Zhouyue Pi, Benyuan Liu, TROIKA: A General Framework for Heart Rate Monitoring Using Wrist-Type Photoplethysmographic Signals During Intensive Physical Exercise, IEEE Trans. on Biomedical Engineering, vol. 62, no. 2, pp. 522-531, February 2015
​(preprint: ​http://arxiv.org/abs/1409.5181)

Zhilin Zhang, Photoplethysmography-Based Heart Rate Monitoring in Physical Activities via Joint Sparse Spectrum Reconstruction, IEEE Transactions on Biomedical Engineering, vol. 62, no. 8, pp. 1902-1910, August 2015
(preprint: http://arxiv.org/abs/1503.00688)

In fact, I think the problem of Photoplethysmography-based heart rate monitoring can be well formulated into various kinds of compressed sensing models, such as multiple measurement vector (MMV) model (as shown in my second paper), gridless compressed sensing model (also mentioned in my second paper), and time-varying sparsity model. Since the data are available online (the download link was given in my papers), I hope these data can encourage compressed sensing researchers to join this area, revealing potential values of compressed sensing in these real-life problems.

I will very appreciate if you can introduce my work on your blog.

Thank you!

Best regards,
Zhilin
Thanks Zhilin ! and yes, I am glad to cover work on how compressive sensing and related techniques can make sense of  IoT type of sensors (and work that includes datasets!). Without further ado:

TROIKA: A General Framework for Heart Rate Monitoring Using Wrist-Type Photoplethysmographic Signals During Intensive Physical Exercise by Zhilin Zhang, Zhouyue Pi, Benyuan Liu

Heart rate monitoring using wrist-type photoplethysmographic (PPG) signals during subjects' intensive exercise is a difficult problem, since the signals are contaminated by extremely strong motion artifacts caused by subjects' hand movements. So far few works have studied this problem. In this work, a general framework, termed TROIKA, is proposed, which consists of signal decomposiTion for denoising, sparse signal RecOnstructIon for high-resolution spectrum estimation, and spectral peaK trAcking with verification. The TROIKA framework has high estimation accuracy and is robust to strong motion artifacts. Many variants can be straightforwardly derived from this framework. Experimental results on datasets recorded from 12 subjects during fast running at the peak speed of 15 km/hour showed that the average absolute error of heart rate estimation was 2.34 beat per minute (BPM), and the Pearson correlation between the estimates and the ground-truth of heart rate was 0.992. This framework is of great values to wearable devices such as smart-watches which use PPG signals to monitor heart rate for fitness.

Photoplethysmography-Based Heart Rate Monitoring in Physical Activities via Joint Sparse Spectrum Reconstruction by Zhilin Zhang

Goal: A new method for heart rate monitoring using photoplethysmography (PPG) during physical activities is proposed. Methods: It jointly estimates spectra of PPG signals and simultaneous acceleration signals, utilizing the multiple measurement vector model in sparse signal recovery. Due to a common sparsity constraint on spectral coefficients, the method can easily identify and remove spectral peaks of motion artifact (MA) in PPG spectra. Thus, it does not need any extra signal processing modular to remove MA as in some other algorithms. Furthermore, seeking spectral peaks associated with heart rate is simplified. Results: Experimental results on 12 PPG datasets sampled at 25 Hz and recorded during subjects' fast running showed that it had high performance. The average absolute estimation error was 1.28 beat per minute and the standard deviation was 2.61 beat per minute. Conclusion and Significance: These results show that the method has great potential to be used for PPG-based heart rate monitoring in wearable devices for fitness tracking and health monitoring.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Monday, July 27, 2015

### Random Mappings Designed for Commercial Search Engines - implementation -

Changing various non-text documents into vectors that have the characteristics of vector texts using thresholded random projections is the goal of today's paper. From the paper:
Although proving that our random mapping scheme works is involved, the scheme is remarkably simple. Our corpus X is a finite collection of vectors in R^d, normalized to have unit l_2 norm. To transform each vector in X, multiply each vector by a random matrix, then threshold each element.

Random mappings designed for commercial search engines by Roger Donaldson, Arijit Gupta, Yaniv Plan, Thomas Reimer

We give a practical random mapping that takes any set of documents represented as vectors in Euclidean space and then maps them to a sparse subset of the Hamming cube while retaining ordering of inter-vector inner products. Once represented in the sparse space, it is natural to index documents using commercial text-based search engines which are specialized to take advantage of this sparse and discrete structure for large-scale document retrieval. We give a theoretical analysis of the mapping scheme, characterizing exact asymptotic behavior and also giving non-asymptotic bounds which we verify through numerical simulations. We balance the theoretical treatment with several practical considerations; these allow substantial speed up of the method. We further illustrate the use of this method on search over two real data sets: a corpus of images represented by their color histograms, and a corpus of daily stock market index values.
Python codes used to generate results of that paper, including running example searches using the Whoosh search engine for Python, is here at: https://gitlab.com/dgpr-sparse-search/code

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Saturday, July 25, 2015

### Saturday Morning Video: Reverse Engineering the Human Visual System

At CVPR this year, there was another plenary speaker Jack Gallant who talked about Reverse Engineering the Human Visual System. The video is here.
Let us note the use of Blender to produce truth scenes which is really what other reserchers ( see A Probabilistic Theory of Deep Learning by Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk ) are suggesting to use in order to remove nuisance parameters and improve learning abilities.
We mentioned Jack Gallant and his group's work here before on Nuit Blanche, see:
Abstract of the talk:
The human brain is the most sophisticated image processing system known, capable of impressive feats of recognition and discrimination under challenging natural conditions. Reverse-engineering the brain might enable us to design artificial systems with the same capabilities. My laboratory uses a data-driven system identification approach to tackle this reverse-engineering problem. Our approach consists of four broad stages. First, we use functional MRI to measure brain activity while people watch naturalistic movies. We divide these data into two parts, one use to fit models and one for testing model predictions. Second, we use a system identification framework (based on multiple linearizing feature spaces) to model activity measured at each point in the brain. Third, we inspect the most accurate models to understand how the brain represents low-, mid- and high-level information in the movies. Finally, we use the estimated models to decode brain activity, reconstructing the structural and semantic content in the movies. Any effort to reverse-engineer the brain is inevitably limited by the spatial and temporal resolution of brain measurements, and at this time the resolution of human brain measurements is relatively poor. Still, as measurement technology progresses this framework could inform development of biologically-inspired computer vision systems, and it could aid in development of practical new brain reading technologies.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Friday, July 24, 2015

### Forward - Backward Greedy Algorithms for Atomic Norm Regularization - implementation -

Nikhil just let me know of version 2 of his preprint, and the attendant code that goes with it. Let us note, as shown in the figure below, that when it is easy to compute, then the atomic norm does a better job than the nuclear norm relaxation.

Forward - Backward Greedy Algorithms for Atomic Norm Regularization by Nikhil Rao, Parikshit Shah, Stephen Wright

In many signal processing applications, the aim is to reconstruct a signal that has a simple representation with respect to a certain basis or frame. Fundamental elements of the basis known as "atoms" allow us to define "atomic norms" that can be used to formulate convex regularizations for the reconstruction problem. Efficient algorithms are available to solve these formulations in certain special cases, but an approach that works well for general atomic norms, both in terms of speed and reconstruction accuracy, remains to be found. This paper describes an optimization algorithm called CoGEnT that produces solutions with succinct atomic representations for reconstruction problems, generally formulated with atomic-norm constraints. CoGEnT combines a greedy selection scheme based on the conditional gradient approach with a backward (or "truncation") step that exploits the quadratic nature of the objective to reduce the basis size. We establish convergence properties and validate the algorithm via extensive numerical experiments on a suite of signal processing applications. Our algorithm and analysis also allow for inexact forward steps and for occasional enhancements of the current representation to be performed. CoGEnT can outperform the basic conditional gradient method, and indeed many methods that are tailored to specific applications, when the enhancement and truncation steps are defined appropriately. We also introduce several novel applications that are enabled by the atomic-norm framework, including tensor completion, moment problems in signal processing, and graph deconvolution.
The attendant implementation is on Nikhil's code page.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Thursday, July 23, 2015

### Slides and Video: What's Wrong with Deep Learning?

This was an invited talk by Yann LeCun at CVPR 2015. The video is here and the slides are here: What's Wrong with Deep Learning?  (h/t Andrei and Gabriel)

I note that what is missing in deep Learning revolves around unsupervised learning and is the subject of most interrogations in compressive sensing and advanced matrix factorizations as per the issue of encoding, decoding and regularizers.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

## Wednesday, July 22, 2015

### Optimal approximate matrix product in terms of stable rank

Here is a new result for Randomized Numerical Linear Algebra.

Optimal approximate matrix product in terms of stable rank by Michael B. Cohen, Jelani Nelson, David P. Woodruff

We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having $m = O(\tilde{r}/\varepsilon^2)$ rows. Here $\tilde{r}$ is the maximum stable rank, i.e. squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [MZ11, KVZ14], and is also optimal for any oblivious dimensionality-reducing map. Furthermore, due to the black box reliance on the subspace embedding property in our proofs, our theorem can be applied to a much more general class of sketching matrices than what was known before, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even with subspace embedding constructions that may be developed in the future.
Our main theorem, via connections with spectral error matrix multiplication shown in prior work, implies quantitative improvements for approximate least squares regression and low rank approximation. Our main result has also already been applied to improve dimensionality reduction guarantees for $k$-means clustering [CEMMP14], and implies new results for nonparametric regression [YPW15].
We also separately point out that the proof of the "BSS" deterministic row-sampling result of [BSS12] can be modified to show that for any matrices $A, B$ of stable rank at most $\tilde{r}$, one can achieve the spectral norm guarantee for approximate matrix multiplication of $A^T B$ by deterministically sampling $O(\tilde{r}/\varepsilon^2)$ rows that can be found in polynomial time. The original result of [BSS12] was for rank instead of stable rank. Our observation leads to a stronger version of a main theorem of [KMST10].

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.