[ContinualAI Reading Group] Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations

[June 19th, 2020] ContinualAI Reading Group : “Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations”

Abstract: Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications. However, training these models often relies on back-propagation through time, which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of back-propagation itself does not permit the use of non-differentiable activation functions and is inherently sequential, making parallelization of the underlying training process difficult. Here, we propose the Parallel Temporal Neural Coding Network (P-TNCN), a biologically inspired model trained by the learning algorithm we call Local Representation Alignment. It aims to resolve the difficulties that plague recurrent networks trained by back-propagation through time. The architecture requires neither unrolling in time nor the derivatives of its internal activation functions. We compare our model and learning procedure to other back-propagation through time alternatives (which also tend to be computationally expensive), including real-time recurrent learning, echo state networks, and unbiased online recurrent optimization. We show that it outperforms these on sequence modeling benchmarks such as Bouncing MNIST, a new benchmark we denote as Bouncing NotMNIST, and Penn Treebank. Notably, our approach can in some instances outperform full back-propagation through time as well as variants such as sparse attentive back-tracking. Significantly, the hidden unit correction phase of P-TNCN allows it to adapt to new datasets even if its synaptic weights are held fixed (zero-shot adaptation) and facilitates retention of prior generative knowledge when faced with a task sequence. We present results that show the P-TNCN’s ability to conduct zero-shot adaptation and online continual sequence modeling.
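
For a concrete feel of how a recurrent update can work without unrolling in time, here is a minimal sketch of a local, correction-style update loosely inspired by the paper's Local Representation Alignment. The one-layer setup, variable names, and the exact update rule are illustrative assumptions on our part, not the P-TNCN's actual equations; see the paper and slides for the real algorithm.

```python
import numpy as np

# Minimal sketch of a local, unrolling-free recurrent update,
# loosely inspired by Local Representation Alignment (LRA).
# All names and the exact update rule here are illustrative
# assumptions, not the paper's precise P-TNCN equations.

rng = np.random.default_rng(0)
in_dim, hid_dim, beta, lr = 8, 16, 0.1, 0.01

W_in = rng.standard_normal((hid_dim, in_dim)) * 0.1    # input -> hidden
W_rec = rng.standard_normal((hid_dim, hid_dim)) * 0.1  # hidden -> hidden
W_out = rng.standard_normal((in_dim, hid_dim)) * 0.1   # hidden -> prediction

z_prev = np.zeros(hid_dim)  # hidden state carried over from step t-1

for x_t in rng.standard_normal((20, in_dim)):  # toy input sequence
    # Forward pass: compute the hidden state and a prediction of the input.
    z = np.tanh(W_in @ x_t + W_rec @ z_prev)
    x_hat = W_out @ z

    # Local error unit: mismatch between prediction and observation.
    e = x_hat - x_t

    # Correction phase: nudge the hidden state toward a locally computed
    # target instead of back-propagating an error chain through time.
    z_target = z - beta * (W_out.T @ e)

    # Hebbian-style updates that use only pre-/post-synaptic activities
    # and local errors; no unrolling, no activation-function derivatives.
    W_out -= lr * np.outer(e, z)
    d = z - z_target                  # hidden-layer mismatch signal
    W_in -= lr * np.outer(d, x_t)
    W_rec -= lr * np.outer(d, z_prev)

    z_prev = z_target  # carry the corrected state forward in time
```

The key point the sketch tries to convey is that each weight matrix is updated from quantities available at its own layer at the current time step, which is what removes the need for unfolding the network and for activation derivatives.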

The speaker for this reading group was:

Alexander Ororbia

:round_pushpin: YouTube recording : https://youtu.be/EWNyqWe6t10
:round_pushpin: Paper pre-print : https://arxiv.org/abs/1810.07411
:round_pushpin: Slides : https://drive.google.com/file/d/1KRB6_7voggdIZExmUYhWctW5I1wuTCTa/view?usp=sharing