@Nelson: Can someone here explain to me the difference between continual learning and domain adaptation? To me, they seem to tackle the same problems, just with a slightly different emphasis.
@Gabriele_Graffieti: In domain adaptation we want to use a network trained on some training data distribution to classify related, but different, data. As an example, we want to use a network trained to classify livestock animals to classify wild animals instead. Mathematically, domain adaptation is the setting where P(x), the distribution of the input data, changes between training and test, while the task itself stays the same.
Continual learning is when a model is continually trained on different data distributions, with the aim of learning how to deal with the new data while the knowledge about the old data is not forgotten. Returning to the previous example, in a CL scenario we want to train the network with images of livestock animals and, after some time, train the same network with images of wild animals. In doing so the network has to learn how to classify wild animals, but at the same time its ability to classify livestock animals should remain the same.
To summarize: in DA we have a model trained on some data and we want to use it on slightly different target data. The aim of the process is to get good performance on the target data without retraining the model from scratch.
In CL we want to continually train a model on many different data distributions (tasks), learning new things while forgetting as little as possible about the old tasks.
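To make the DA side concrete, here is a minimal sketch of the simplest adaptation strategy: fine-tuning a source-trained classifier on labeled target-domain data. This is only an illustration; the model, loader, and hyperparameters are assumptions, not something from the thread.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def adapt_to_target(model: nn.Module, target_loader: DataLoader,
                    epochs: int = 5, lr: float = 1e-3) -> nn.Module:
    """Simplest form of domain adaptation: fine-tune a model that was
    pre-trained on the source domain using labeled target-domain data."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in target_loader:  # target-domain batches only
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
    # Note: in DA we only care about target performance here; accuracy
    # on the original source data is allowed to degrade.
    return model
```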
@Nelson: From your explanation, they still sound very similar. In DA you may need to fine-tune the pre-trained model, so, in a way, this is continual learning with two tasks, right? I don’t think DA is restricted to cases where the training data comes from a distribution different from the test data. Or is this actually the restriction and the difference between DA and CL, i.e. in CL you want to continually train with possibly more than two tasks, while in DA you just want to adapt (so not necessarily re-train, although I am not sure what you would do instead) your pre-trained model to a test distribution that is different from the distribution the training data was sampled from? Btw, how is domain adaptation different from fine-tuning (also known as transfer learning)?
@Gabriele_Graffieti: Yes, you are right. In CL you want to continually train your model on a possibly large number of tasks, while in DA you usually have a model trained on some data and you want to use it on some “similar but not so similar” data.
The important thing is that the problems they try to solve are different. In DA you have a model trained on data A and you want to adapt it to work on data B. In CL you want to train a model that works well on both data A and data B, but you cannot access the A and B data at the same time, so you have to train first on A only and then on B only.
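As a sketch of that constraint (again purely illustrative: PyTorch-style, with a small replay buffer as one common way to limit forgetting; none of these names or parameters come from the thread), training on A and then on B could look like this:

```python
import random
import torch
import torch.nn as nn

def train_continually(model, task_loaders, buffer_size=200, lr=1e-3):
    """Train on tasks A, B, ... strictly one after another. Data from
    earlier tasks is gone, except for a tiny replay memory."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    memory = []  # small buffer of past (x, y) mini-samples
    model.train()
    for loader in task_loaders:            # e.g. [livestock_loader, wild_loader]
        for x, y in loader:
            if len(memory) < buffer_size:  # keep a few current examples
                memory.append((x[:4].detach(), y[:4].detach()))
            # Mix the current batch with a few replayed old examples so
            # the model does not completely forget earlier tasks.
            mx, my = random.choice(memory)
            x, y = torch.cat([x, mx]), torch.cat([y, my])
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
    return model  # ideally still accurate on every task seen so far
```

Without the replay buffer (or some other mechanism, such as a regularization penalty), plain sequential fine-tuning tends to overwrite what was learned on A, which is exactly the catastrophic forgetting that CL tries to avoid.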
Domain adaptation and transfer learning are quite interchangeable terms. There isn’t general agreement on the difference between the two, but from the survey below it seems that domain adaptation is a subclass of transfer learning:
Pan, Sinno Jialin, and Qiang Yang. “A survey on transfer learning.” IEEE Transactions on Knowledge and Data Engineering 22, no. 10 (2010): 1345–1359.
@vlomonaco: I’ve seen the term “incremental domain adaptation” used as a synonym for Continual Learning recently, but the focus of DA was totally different from the focus of CL, at least until a few years ago.
This means two very different bodies of literature and algorithms… let’s hope they can be merged soon.