Friday, June 12, 2015

Random Maxout Features look like Quantized JL Embeddings

The Great Convergence in Action has another example today: folks from different fields doing very similar things, sometimes with different vocabulary.
Random Maxout Features by Youssef Mroueh, Steven Rennie, Vaibhava Goel
In this paper, we propose and study random maxout features, which are constructed by first projecting the input data onto sets of randomly generated vectors with Gaussian elements, and then outputting the maximum projection value for each set. We show that the resulting random feature map, when used in conjunction with linear models, allows for the locally linear estimation of the function of interest in classification tasks, and for the locally linear embedding of points when used for dimensionality reduction or data visualization. We derive generalization bounds for learning that assess the error in approximating locally linear functions by linear functions in the maxout feature space, and empirically evaluate the efficacy of the approach on the MNIST and TIMIT classification tasks.
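If a toy implementation helps, here is a minimal NumPy sketch of that construction as I read it from the abstract: project onto groups of Gaussian random vectors and keep the maximum per group. The feature count m and group size k below are illustrative choices, not values from the paper.

import numpy as np

def random_maxout_features(X, m=256, k=4, rng=None):
    """Map each row of X (n x d) to m random maxout features:
    for each feature, project onto k Gaussian random directions
    and keep the maximum projection value."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # One Gaussian matrix per feature group: shape (m, k, d)
    W = rng.standard_normal((m, k, d))
    # Projections of all points onto all directions: shape (n, m, k)
    proj = np.einsum('nd,mkd->nmk', X, W)
    # Max over each group of k directions: shape (n, m)
    return proj.max(axis=2)

# Example: features for 100 points in R^20, to be fed to any linear model
X = np.random.randn(100, 20)
Phi = random_maxout_features(X, m=256, k=4, rng=0)
print(Phi.shape)  # (100, 256)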
Obviously, this is very similar to the quantized Johnson-Lindenstrauss embeddings explored, in part, by Petros Boufounos and others. In Petros's recent video, the relevant discussion starts at 15 minutes and 30 seconds...


... and the figure that resembles Figure 1 of the paper above appears at 19 minutes and 57 seconds.
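To make the resemblance concrete, here is an equally small sketch of the simplest quantized random projection of the kind discussed in the video, a 1-bit (sign) quantization of Gaussian projections as in 1-bit compressive sensing. As with the snippet above, the dimensions are placeholders; the point is only that both maps apply a simple nonlinearity (max over a group vs. a scalar quantizer) to Gaussian random projections.

import numpy as np

def one_bit_jl_embedding(X, m=256, rng=None):
    """1-bit quantized random projection: keep only the sign of
    each Gaussian projection of the rows of X (n x d)."""
    rng = np.random.default_rng(rng)
    A = rng.standard_normal((m, X.shape[1]))
    return np.sign(X @ A.T)  # shape (n, m), entries in {-1, 0, +1}

X = np.random.randn(100, 20)
B = one_bit_jl_embedding(X, m=256, rng=0)
print(B.shape)  # (100, 256)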

Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

1 comment:

manny said...

Thanks Igor, you did it again! Linking quantized JL/1-bit CS to maxout is very eye-opening. Until now, the maxout layer in CNNs had been a bit mysterious to me. The linear layer is not a problem, since it is really just sparse coding.

cheers.
