# ReLU as a switch, associative memory, etc.

ReLU is a switch:
On=identity function (f(x)=x)=connect.
Off=zero=disconnect.
A ReLU neural network is then a switched composition of dot products (weighted sums).
A weighted sum of weighted sums is still a weighted sum, so nested dot products collapse to a single dot product.

For a particular input vector, the state of each switch in the network is known.
The input to each ReLU is then a composition of dot products that can be condensed down to a single dot product with the input vector. I think that is interesting.
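A minimal sketch of that condensation in NumPy (the two-layer network, its sizes, and its random weights are all illustrative): once the switch states are read off for a given input, the whole network reduces to one effective matrix, so each output is a single dot product with the input.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4))   # first-layer weights (illustrative)
W2 = rng.standard_normal((3, 8))   # second-layer weights (illustrative)
x = rng.standard_normal(4)         # a particular input vector

# Forward pass: each ReLU is a switch that is on (identity) or off (zero).
h = W1 @ x
mask = (h > 0).astype(float)       # the switch states for this input
y = W2 @ (mask * h)                # same as W2 @ np.maximum(W1 @ x, 0)

# With the switch states frozen, the network condenses to one matrix:
# y = W2 @ diag(mask) @ W1 @ x, so each output is one dot product with x.
W_eff = W2 @ (mask[:, None] * W1)
assert np.allclose(y, W_eff @ x)
```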

Anyway, you can use that understanding of how ReLU neural networks work to build neural networks based on fast transforms such as the FFT or the Walsh-Hadamard transform.
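As an illustrative sketch of the idea (not the specific nets mentioned above): a fast Walsh-Hadamard transform can supply the fixed mixing in place of a dense weight matrix, with only a per-element scaling learned before the ReLU.

```python
import numpy as np

def wht(x):
    # Iterative fast Walsh-Hadamard transform: O(n log n) add/subtracts.
    x = np.asarray(x, dtype=float).copy()
    n = len(x)                     # n must be a power of two
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

# One "fixed filter bank" style layer (a sketch): the transform does the
# mixing for free, so only n scale parameters are learned, not n*n weights.
rng = np.random.default_rng(1)
scale = rng.standard_normal(8)     # the learned parameters (illustrative)
x = rng.standard_normal(8)
layer_out = np.maximum(wht(scale * x), 0.0)
```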

Associative memory based on hashing:
Hashing produces random vectors that, in higher-dimensional space, are approximately orthogonal. A weighted sum can easily be used to associate scalar values with a fixed number of those random vectors.
Using milder forms of hashing, such as locality-sensitive hashing (LSH) or reservoirs (as in reservoir computing), allows you to create a more general form of associative memory. Additional mathematics then becomes relevant: the variance equation for linear combinations of random variables and the central limit theorem.
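A minimal sketch of the first idea, using random ±1 key vectors as a stand-in for hash outputs (the dimension and stored values are illustrative): writing superimposes value-weighted keys into one weight vector, and reading back is a single dot product, with crosstalk noise that shrinks as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 1024                           # dimension; larger d -> keys more nearly orthogonal

# Random +/-1 key vectors: their pairwise dot products, scaled by d,
# concentrate near 0 (central limit theorem), so they act like hash keys.
keys = rng.choice([-1.0, 1.0], size=(3, d))
values = np.array([2.5, -1.0, 7.0])

# Write: superimpose the value-weighted keys into one weight vector.
w = values @ keys                  # w = sum_i values[i] * keys[i]

# Read: a dot product with a key recalls its value plus small crosstalk noise.
recalled = (w @ keys[0]) / d       # approximately 2.5
```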


Thanks for sharing your thoughts on this, it is indeed very interesting! I just wrote an ALife combining fixed filter bank (fast transform) neural nets with associative memory, which makes a Neural Turing Machine.
The example problems are not so great and not so well coded. The neural network and associative memory code are okay, though; I did test them.
I’ll try to find other suitable problems for small Neural Turing Machines.
https://discourse.processing.org/t/artificial-life/19741

I found this paper on Butterfly Neural Networks:
https://arxiv.org/abs/1906.02256
They use the FFT (WHT) butterfly operator to reduce the computational cost of neural networks.
Conventional neural networks need n*n multiply-adds per layer, where n is the width of the network. That is very time consuming and may be an inefficient use of weight parameters.
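A back-of-envelope count of that gap (illustrative operation counts, not a benchmark): a dense layer costs n*n multiply-adds, while a butterfly/fast-transform layer costs on the order of n*log2(n) add/subtracts.

```python
import math

# Per-layer cost at width n = 1024 (illustrative count, not a benchmark).
n = 1024
dense_ops = n * n                       # multiply-adds for a dense n x n weight matrix
butterfly_ops = n * int(math.log2(n))   # add/subtracts across log2(n) butterfly stages
print(dense_ops, butterfly_ops)         # 1048576 vs 10240: roughly a 100x gap
```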