About continuous learning applied to regression tasks

Hi everyone. I have a couple of questions about continuous learning (CL) applied to regression tasks. I have only just started my research in this field, so I apologize in advance if I say something wrong.

I need to handle a time-series regression forecasting task. In my application, new data is collected continuously, and the statistical properties of the dataset, such as its probability distribution, change over time. Furthermore, access to the old data may be restricted in the future due to practical constraints. Therefore, in my opinion, continuous learning algorithms are a natural fit here.

My questions are:
Q1: Are there any published papers on CL applied to regression? I have tried but failed to find any. If possible, please share the links with me. Thanks in advance!
Q2: Are there any methods for splitting a continuous time series into different tasks? Assuming the distribution of the input data changes over time, can each segment of the dataset that follows the same distribution be labeled as a new task? Or is there a better way?

I would also be really interested in discussing other issues regarding CL in regression, beyond those mentioned above.

Thanks!

Do you need to predict the old data after training on the new one? Often when you deal with time series you are only interested in predicting novel data, and therefore you don’t care about catastrophic forgetting too much. If this is your case, you may find some literature by looking for work in “online learning” or “learning in nonstationary environments”.

For a starting point, you can look at this review: https://ieeexplore.ieee.org/document/7296710

Regarding Q2: for task separation, you typically divide the time series into chunks that have different distributions. If you have a simple way to detect distribution changes, you can definitely use it to separate tasks. The review above also covers some automatic approaches for detecting concept drift.
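To make that concrete, here is a minimal sketch of that idea. Everything in it (the window size, the threshold, the function names) is my own illustrative choice, not taken from the review: it compares a reference window against the most recent window with a two-sample Kolmogorov-Smirnov statistic and starts a new "task" whenever the statistic exceeds a threshold.

```python
import random

def ks_statistic(a, b):
    """Two-sample KS statistic: max gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] == x:  # advance past ties together
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def segment_tasks(stream, window=200, threshold=0.3):
    """Return start indices of detected tasks (distribution segments)."""
    boundaries = [0]
    ref = stream[:window]           # reference window for the current task
    pos = window
    while pos + window <= len(stream):
        cur = stream[pos:pos + window]
        if ks_statistic(ref, cur) > threshold:
            boundaries.append(pos)  # distribution changed: new task here
            ref = cur               # reset the reference to the new regime
        pos += window
    return boundaries

random.seed(0)
# Synthetic stream: the mean shifts from 0 to 3 halfway through.
stream = ([random.gauss(0, 1) for _ in range(1000)] +
          [random.gauss(3, 1) for _ in range(1000)])
print(segment_tasks(stream))  # a boundary should appear near index 1000
```

In practice you would pick the window size to match how quickly the distribution can drift, and calibrate the threshold (e.g. from the KS critical values) to trade off false alarms against missed changes.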

Hi, thanks for your answer.
In this application scenario, my goal is to keep learning from new datasets to improve the current prediction model. It is not only the new data that is of interest, but also the old data.
Take predicting a household’s electricity consumption as an example. First, consumption changes over time: the pattern differs between winter and summer, and consumption may drop when the occupants travel during holidays. In such cases, if we pre-train the prediction model on only a small dataset, that dataset usually cannot fully describe the true probability distribution. Here, I believe we can use continuous learning to absorb the knowledge contained in the new data without forgetting the old knowledge.
Second, the user’s consumption behavior itself can change, for example after purchasing a new appliance that runs on weekends, or starting to leave the house for a few days each month. When such new behaviors become recurring patterns, the model could perhaps treat them as new knowledge to be learned while still remembering the usual electricity usage.
So I was wondering whether continuous learning can be used for regression forecasting, and how to define the different tasks, as is done for classification.

I agree with you that it can be helpful to retain knowledge of the old task in your application.
Unfortunately, I am not familiar with any work on regression problems with time series. We did work with sequences in [1], but only for sequence classification problems.

In your situation, the tasks can be defined by splitting the time series. For example, each week you collect all the data about the electricity consumption; at the end of the week you have a new dataset that you can use to update the model.
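As a rough sketch of that weekly update loop (the class, the rehearsal buffer, and all hyperparameters below are my own illustrative assumptions, not the method from [1]): a simple linear model is updated on each week's batch, mixed with a small replay buffer of past samples so that new data does not entirely overwrite what was learned before.

```python
import random

class RehearsalRegressor:
    """Toy 1-D linear model updated weekly with rehearsal of old samples."""

    def __init__(self, lr=0.1, buffer_size=100):
        self.w, self.b = 0.0, 0.0
        self.lr = lr
        self.buffer = []                 # reservoir of past (x, y) pairs
        self.buffer_size = buffer_size
        self.seen = 0

    def _sgd_step(self, x, y):
        err = self.w * x + self.b - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err

    def update(self, week_data, epochs=5):
        """Train on one week's batch mixed with rehearsed old samples."""
        for _ in range(epochs):
            for x, y in week_data + self.buffer:
                self._sgd_step(x, y)
        for x, y in week_data:           # reservoir-sample into the buffer
            self.seen += 1
            if len(self.buffer) < self.buffer_size:
                self.buffer.append((x, y))
            elif random.random() < self.buffer_size / self.seen:
                self.buffer[random.randrange(self.buffer_size)] = (x, y)

    def predict(self, x):
        return self.w * x + self.b

def make_week(n=50):
    """Synthetic weekly batch from the underlying relation y = 2x + 1."""
    xs = [random.uniform(0, 1) for _ in range(n)]
    return [(x, 2 * x + 1 + random.gauss(0, 0.1)) for x in xs]

random.seed(1)
model = RehearsalRegressor()
for _ in range(2):                       # two weeks of data arrive in turn
    model.update(make_week())
print(model.predict(0.5))  # should approach 2 * 0.5 + 1 = 2.0
```

With a real forecaster you would replace the linear model with your network and the plain replay buffer with whichever CL strategy you choose, but the weekly collect-then-update structure stays the same.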

[1] https://arxiv.org/abs/2004.04077

Thanks again for your answer and sharing!
I will read the paper linked in your reply. :grinning: