A few years ago I wrote code for a very large associative memory that is quite fast. You can think of it as a soft vector-to-vector hash table, "soft" meaning it tolerates imprecise addressing.
An artificial neural network would find it very difficult to learn to use conventional digital memory, which requires an exactly precise address to retrieve information.
The associative memory uses locality-sensitive hashing to select particular weight vectors for a particular input. It only needs to select a few (say 10 to 20) weight vectors from a potentially very large set.
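The post doesn't give the actual implementation, but the selection step could look something like the following sketch, which uses sign-of-random-hyperplane hashing (a standard locality-sensitive hash for cosine similarity). All the names and sizes here are illustrative assumptions, not the author's code.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_vectors, n_bits = 64, 1000, 8   # toy sizes; the real store can be far larger

weights = rng.standard_normal((n_vectors, d))   # the stored weight vectors
planes = rng.standard_normal((n_bits, d))       # random hyperplanes for sign-LSH

def hash_code(x):
    """One bit per hyperplane: which side of the plane x falls on."""
    return sum(int(b) << i for i, b in enumerate(planes @ x > 0))

# Group vector indices by hash bucket once, up front.
buckets = {}
for idx in range(n_vectors):
    buckets.setdefault(hash_code(weights[idx]), []).append(idx)

def select(x):
    """Return the few stored weight vectors whose bucket matches the input's code."""
    return weights[buckets.get(hash_code(x), [])]
```

With 8 hash bits there are 256 buckets, so each lookup touches only a handful of the 1000 stored vectors; nearby inputs tend to land in the same bucket, which is the "imprecise addressing" property.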
Since each weight vector is contiguous in memory, the access pattern is relatively easy for a CPU/GPU cache system to handle.
The selected weight vectors are then combined by a simple associative memory algorithm based on random projections. The overall computational cost can be relatively low.
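The post doesn't specify the read/write algorithm, but here is one minimal scheme in that spirit (close to a sparse distributed memory): each slot has a fixed random projection, an input activates the few slots whose projections respond most strongly, reads sum the selected slots, and writes spread the residual across them. The top-k selection here is an assumed stand-in for the hashing step described above.

```python
import numpy as np

rng = np.random.default_rng(1)

d_in, d_out = 32, 32
n_slots, n_select = 512, 10   # many slots, only a handful selected per input

slots = np.zeros((n_slots, d_out))            # value store, one vector per slot
proj = rng.standard_normal((n_slots, d_in))   # fixed random projections, one per slot

def selected(x):
    # pick the n_select slots whose random projections respond most strongly
    return np.argsort(proj @ x)[-n_select:]

def recall(x):
    # read: sum the values held by the selected slots
    return slots[selected(x)].sum(axis=0)

def write(x, target, lr=1.0):
    # write: distribute the residual error evenly over the selected slots
    idx = selected(x)
    slots[idx] += lr * (target - recall(x)) / len(idx)
```

Each operation touches only `n_select` of the `n_slots` rows, which is why the cost stays low and why, as noted next, the bulk of the store could live on slow media.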
In fact, it could support an extremely large associative memory when backed by, say, an SSD.