Monday, April 17, 2017

F2F: A Library For Fast Kernel Expansions

(Personal message: I will be at ICLR next week; let's grab some coffee if you are there.)

F2F is a C++ library for large-scale machine learning. It contains a CPU-optimized implementation of the Fastfood algorithm, which allows approximate kernel expansions to be computed in log-linear time. The algorithm requires computing products with Walsh-Hadamard Transform (WHT) matrices. A cache-friendly SIMD Fast Walsh-Hadamard Transform (FWHT) has been developed that achieves compelling speed and outperforms current state-of-the-art methods. F2F enables non-linear classification by combining Fastfood with a linear classifier.
I am told by one of the authors that the library should be released at some point.
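
For readers who want a sense of what the expansion involves, here is a rough sketch, in plain C++, of one Fastfood block built around an in-place FWHT. This is not the F2F code (which is not out yet); every name here is my own illustration, and the scaling matrix S in particular is simplified relative to the construction in the Fastfood paper.

// Rough sketch of one Fastfood block V = S H G Pi H B (applied right to left),
// where B, G, S are random diagonal matrices, Pi a random permutation and H the
// Walsh-Hadamard matrix applied implicitly via the FWHT in O(d log d).
// Illustrative only: names and the scaling of S are assumptions, not F2F's API.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// In-place, unnormalized Fast Walsh-Hadamard Transform; x.size() must be a power of two.
static void fwht(std::vector<float>& x) {
    const std::size_t d = x.size();
    for (std::size_t h = 1; h < d; h <<= 1)
        for (std::size_t i = 0; i < d; i += 2 * h)
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = x[j], b = x[j + h];
                x[j]     = a + b;
                x[j + h] = a - b;
            }
}

struct FastfoodBlock {
    std::vector<float> B, G, S;        // diagonals: random signs, Gaussian scales, rescaling
    std::vector<std::size_t> Pi;       // random permutation
    float sigma;                       // RBF kernel bandwidth

    FastfoodBlock(std::size_t d, float sigma_, unsigned seed) : sigma(sigma_) {
        std::mt19937 rng(seed);
        std::bernoulli_distribution coin(0.5);
        std::normal_distribution<float> gauss(0.0f, 1.0f);
        B.resize(d); G.resize(d); S.resize(d); Pi.resize(d);
        std::iota(Pi.begin(), Pi.end(), std::size_t{0});
        std::shuffle(Pi.begin(), Pi.end(), rng);
        float gnorm2 = 0.0f;
        for (std::size_t i = 0; i < d; ++i) {
            B[i] = coin(rng) ? 1.0f : -1.0f;
            G[i] = gauss(rng);
            gnorm2 += G[i] * G[i];
        }
        // The paper draws S so that row norms of V match those of a dense Gaussian
        // matrix (a chi-distributed length per row); this simplified stand-in just
        // gives every row the typical norm sqrt(d) instead.
        for (std::size_t i = 0; i < d; ++i)
            S[i] = std::sqrt(static_cast<float>(d) / gnorm2);
    }

    // Returns y = (1 / (sigma * sqrt(d))) * S H G Pi H B x.
    std::vector<float> apply(std::vector<float> x) const {
        const std::size_t d = x.size();
        for (std::size_t i = 0; i < d; ++i) x[i] *= B[i];            // random sign flip
        fwht(x);                                                      // first Hadamard mix
        std::vector<float> y(d);
        for (std::size_t i = 0; i < d; ++i) y[i] = G[i] * x[Pi[i]];   // permute, Gaussian scale
        fwht(y);                                                      // second Hadamard mix
        const float scale = 1.0f / (sigma * std::sqrt(static_cast<float>(d)));
        for (std::size_t i = 0; i < d; ++i) y[i] *= S[i] * scale;
        return y;  // plug into cos/sin (random Fourier) features for the RBF kernel
    }
};

To produce more output features than input dimensions, several such blocks with independent random draws are stacked; the point is that each block applies in O(d log d) time and O(d) storage, versus O(d^2) for an equivalent dense Gaussian block.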




2 comments:

SeanVN said...

You can get 5000 65536-point 32-bit floating-point Fast Walsh-Hadamard Transforms per second on a single Intel Celeron core. They must be doing something terribly wrong.
https://drive.google.com/open?id=0BwsgMLjV0BnhMG5leTFkYWJMLU0

You should be able to get something like 75 million to 100 million 65536-point FWHTs per second on a top-of-the-range GPU.

The basic algorithm is:

sub wht(x() as single)
    ' In-place, unnormalized Fast Walsh-Hadamard Transform.
    ' The array length must be a power of two.
    dim as ulongint i, k
    dim as ulongint hs = 1, m = ubound(x)   ' m = n-1
    dim as single a, b
    while hs <= m                           ' double the butterfly span each pass
        i = 0
        while i <= m
            k = i + hs                      ' end of the first half of this block
            while i < k                     ' butterfly the two halves of the block
                a = x(i)
                b = x(i + hs)
                x(i) = a + b
                x(i + hs) = a - b
                i += 1
            wend
            i = k + hs                      ' jump to the start of the next block
        wend
        hs += hs
    wend
end sub

SeanVN said...

That should have been 75 to 100 million 256-point FWHTs per second on a single GPU.

https://estudogeral.sib.uc.pt/bitstream/10316/27842/1/Optimized%20Fast%20Walsh%E2%80%93Hadamard%20Transform.pdf

Probably about 1 to 3 million 65536-point FWHTs/sec on a single GPU. A single 65536-point FWHT takes roughly a million operations (n log2 n = 65536 × 16 ≈ 1.05 million additions/subtractions), so each FWHT per second needs about 1 MFlop/sec.
The only thing you then need for random projections (RP) is a random sign flip of the data before the FWHT (see the sketch after this comment). For better quality, repeat. Including random permutations could give you more entropy per step, but it is expensive time-wise.
One thing you could do is make a "soft" hash table as a possible discriminator for GAN neural nets, or as soft memory for deep networks.
https://drive.google.com/open?id=0BwsgMLjV0BnhellXODdLZWlrOWc
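
The sign-flip-then-transform recipe described in the comment above can be written in a few lines. This is only an illustration in C++ (names and the normalization choice are my own, not code from any particular library):

// Random projection by repeated (random sign flip, FWHT) rounds, as described
// above. Illustrative sketch only.
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// In-place unnormalized FWHT; x.size() must be a power of two.
static void fwht(std::vector<float>& x) {
    for (std::size_t h = 1; h < x.size(); h <<= 1)
        for (std::size_t i = 0; i < x.size(); i += 2 * h)
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = x[j], b = x[j + h];
                x[j] = a + b;
                x[j + h] = a - b;
            }
}

// Fixed seed -> fixed projection; more rounds -> better mixing, at more cost.
static void random_projection(std::vector<float>& x, unsigned seed, int rounds = 2) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution coin(0.5);
    const float norm = 1.0f / std::sqrt(static_cast<float>(x.size()));
    for (int r = 0; r < rounds; ++r) {
        for (float& v : x)
            if (coin(rng)) v = -v;      // random sign flip of the data
        fwht(x);                        // then the Walsh-Hadamard transform
        for (float& v : x) v *= norm;   // 1/sqrt(n) keeps the scale stable
    }
}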

