I wrote a comment on my blog post about growing fast transform neural networks 2 layers at a time.
Basically you can initialize the parametric activation functions to be the identity function, f(x) = x. Then one layer is a forward transform and the second layer is the inverse transform (the WHT is self-inverse).
Adding those 2 layers does nothing until you train the parametric activation functions to be something other than the identity function.
If you only train the 2 final layers you have added, the prior network layers remain unchanged and cannot catastrophically forget.
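To make that concrete, here is a minimal numpy sketch (my own illustration of the idea, not the blog post's code): with the activations initialized to the identity, the newly added pair of WHT layers passes the input straight through, because the normalized WHT is self-inverse.

```python
import numpy as np

def wht(x):
    """Fast Walsh-Hadamard transform, normalized so applying it twice
    returns the original vector (i.e. the WHT is self-inverse)."""
    y = np.asarray(x, dtype=float).copy()
    n = y.size                     # n must be a power of 2
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                y[j], y[j + h] = y[j] + y[j + h], y[j] - y[j + h]
        h *= 2
    return y / np.sqrt(n)

identity = lambda v: v             # parametric activation at initialization

def two_new_layers(x):
    # Layer 1: forward WHT, then identity activations.
    # Layer 2: WHT again (its own inverse), then identity activations.
    return identity(wht(identity(wht(x))))

x = np.random.randn(8)
print(np.allclose(two_new_layers(x), x))   # True: the new pair is a no-op
```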
Thanks @SeanC4S for sharing your blog post here!
A simple visual example of a Fast Transform Neural Network:
It is similar to conventional ReLU-based artificial neural networks: i.e., dot products, with switching according to sign.
We were having this conversation about some of the problems of conventional artificial neural networks over at Konduit: https://community.konduit.ai/t/fast-transform-neural-net-visual-example/497
In a conventional net you have one nonlinear term leading to n weights, each weight going to a different dimension. Unless substantial linear correlations are wanted across those many dimensions, that is a big waste.
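Rough numbers make the point (my own back-of-the-envelope arithmetic, assuming a two-slope parametric activation with two parameters per dimension):

```python
import math

# Conventional dense layer vs. fast transform layer at width n = 256.
n = 256
dense_weights = n * n                  # 65,536 adjustable weights...
dense_nonlinear = n                    # ...fed by only 256 nonlinear terms
ft_parameters = 2 * n                  # 512 (two slopes per dimension, assumed)
ft_fixed_ops = n * int(math.log2(n))   # the WHT: n*log2(n) fixed adds/subtracts
print(dense_weights, dense_nonlinear, ft_parameters, ft_fixed_ops)
```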
Also we were talking about random projections. It seems there are many types.
Technically, randomly flipping the + and - signs of the elements of a vector is a random projection, as is a random permutation. However, those random projections do not spread out the data. Adding a transform like the WHT or FFT afterward does result in a spreading random projection. There are also random projections that only partially spread out the data, ones that are invertible or non-invertible, ones that leave vector length changed or unchanged, etc.
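A quick sketch of that combination (my own illustration, reusing the same normalized WHT as above): sign flips alone leave a spiky vector spiky, but following them with a WHT spreads the energy over every dimension.

```python
import numpy as np

def wht(x):
    """Normalized fast Walsh-Hadamard transform (same as the earlier sketch)."""
    y = np.asarray(x, dtype=float).copy()
    n = y.size
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                y[j], y[j + h] = y[j] + y[j + h], y[j] - y[j + h]
        h *= 2
    return y / np.sqrt(n)

rng = np.random.default_rng(0)
n = 8
signs = rng.choice([-1.0, 1.0], size=n)   # fixed random +/- pattern

x = np.zeros(n)
x[3] = 1.0                                 # a "spiky" vector

print(x * signs)        # sign flips alone: still spiky, nothing spread out
print(wht(x * signs))   # sign flips + WHT: energy spread over all dimensions
# Both steps are orthogonal, so this particular random projection is
# invertible and leaves the vector length unchanged.
```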
There was an unanswered question: can you train Fast Transform Neural Networks with back-propagation? I had been training them with evolution-based optimization algorithms.
I wrote some code to find out.
The answer is yes, you can train them with back-propagation:
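The code itself isn't reproduced here, so below is a rough stand-in sketch in PyTorch (the two-slope activation and the toy fitting task are my own assumptions, not necessarily the exact setup). The point is that the WHT can be written with ordinary differentiable tensor ops, so autograd back-propagates through it like any other layer.

```python
import torch

def wht(x):
    """Differentiable fast Walsh-Hadamard transform over the last dim.

    Built from plain reshapes, adds and subtracts, so autograd can
    back-propagate through it. The width n must be a power of 2, and
    the 1/sqrt(n) normalization makes the transform self-inverse."""
    b, n = x.shape
    h = 1
    while h < n:
        x = x.reshape(b, n // (2 * h), 2, h)
        x = torch.cat((x[:, :, 0] + x[:, :, 1],
                       x[:, :, 0] - x[:, :, 1]), dim=-1).reshape(b, n)
        h *= 2
    return x / n ** 0.5

class TwoSlope(torch.nn.Module):
    """Parametric activation: one slope for x < 0, another for x >= 0.
    Initialized to the identity (both slopes 1)."""
    def __init__(self, n):
        super().__init__()
        self.pos = torch.nn.Parameter(torch.ones(n))
        self.neg = torch.nn.Parameter(torch.ones(n))

    def forward(self, x):
        return torch.where(x >= 0, self.pos * x, self.neg * x)

class FastTransformNet(torch.nn.Module):
    """Alternating WHT / parametric-activation layers."""
    def __init__(self, n, layers):
        super().__init__()
        self.acts = torch.nn.ModuleList([TwoSlope(n) for _ in range(layers)])

    def forward(self, x):
        for act in self.acts:
            x = act(wht(x))
        return wht(x)

# Toy check: fit random targets with plain back-propagation.
torch.manual_seed(0)
n = 16
net = FastTransformNet(n, layers=3)
x = torch.randn(64, n)
y = torch.randn(64, n)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for step in range(1000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()
print(loss.item())  # the loss falls, so gradients do flow through the WHTs
```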