View in #questions-forum on Slack
@Suri_Bhasker_Sri_Harsha: Hello everyone,
Is generative replay using a GAN any good for “visually complex” datasets like Fashion-MNIST or CIFAR10? Did anyone try it? Are there any papers in the literature where they pseudo-rehearsed a neural network on a visually complex dataset like CIFAR10 or ImageNet? Any pointers on this topic would be greatly appreciated.
@Keiland: Maybe this will help https://arxiv.org/pdf/1802.03875.pdf
Major emphasis on the maybe, this article seems to raise some cause for a cautious read…
@Suri_Bhasker_Sri_Harsha: OK thanks. Actually I tried to perform pseudo-rehearsal for the CIFAR10 dataset using a GAN. The first major thing that I observed was that the generated images had the “color composition” of natural images. However, they were totally abstract. When I pseudo-rehearsed my neural net on them, the model’s performance plummeted to 10%. I was not sure if I was doing it right. MNIST handwritten digits worked like magic, by the way. Visually complex datasets were not working.
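For readers unfamiliar with the setup being described, here is a minimal sketch of how a pseudo-rehearsal batch is typically assembled: samples drawn from a generator are labelled by the frozen previous model and mixed with real data from the new task. This is not the code used in the thread. `build_rehearsal_batch`, `toy_generator` and `toy_old_model` are hypothetical names, and the 1-D "images" are toy stand-ins for a GAN and a classifier.

```python
import random

def build_rehearsal_batch(new_batch, generator, old_model, n_replay, rng):
    """Mix real new-task samples with generated, pseudo-labelled old-task samples."""
    replay = []
    for _ in range(n_replay):
        x = generator(rng)   # draw a pseudo-example meant to resemble old-task data
        y = old_model(x)     # label it with the frozen model trained on the old tasks
        replay.append((x, y))
    batch = list(new_batch) + replay
    rng.shuffle(batch)       # interleave real and replayed samples
    return batch

# Toy stand-ins (assumptions): 1-D inputs, old classes 0 and 1, new class 2.
def toy_generator(rng):
    return rng.gauss(rng.choice([-2.0, 2.0]), 0.5)

def toy_old_model(x):
    return 0 if x < 0 else 1

rng = random.Random(0)
new_batch = [(10.0 + rng.gauss(0, 0.5), 2) for _ in range(4)]
batch = build_rehearsal_batch(new_batch, toy_generator, toy_old_model, 4, rng)
print(batch)
```

Note that the replayed labels are only as meaningful as the generated samples: if the generator produces abstract images, the old model's labels for them carry little information about the original classes.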
@Martin_Mundt: In our experience (https://arxiv.org/abs/1905.12019) generative replay works like a charm for “non-color” easy image or audio datasets, with some extensions (the content of the paper). However, it looks like generative models have a hard time with difficult image structure and color dependencies in general. While we have not reported color images in the paper (simply because it didn’t help to serve the main point of the paper and the results are still decently crappy), using a very complex autoregressive model that conditions each pixel upon prior pixels starts to somewhat help with this issue. It’s still pretty bad, but better than random and certainly a lot better than standard GAN/VAE techniques. For us this meant that it’s scalable in principle, and I would intuitively expect it to work better with even more recent advances such as vector quantized latent spaces in VQVAE2 etc. If you want to play with it or talk about it, we have our code online here: https://github.com/MrtnMndt/OCDVAEContinualLearning
@Suri_Bhasker_Sri_Harsha: Thanks @Martin_Mundt
@Martin_Mundt: Let me know if you have any questions or if you are interested in taking a deeper look at said VQVAE or similar autoregressive generative models. Any feedback is also welcome.
@Suri_Bhasker_Sri_Harsha: Actually I have another question, related to the Fashion-MNIST dataset. I saw that GANs were able to produce impressive photorealistic images. However, they fail to rehearse a network. I am bringing this up because you mentioned “non-color” image datasets. The Fashion dataset consists of grayscale images and the GAN also manages to produce photorealistic images. Then why is it hard to perform rehearsal on them? Do you have any insights?
@Martin_Mundt: I think so, yeah. In short: the information necessary to retain discriminative representations for e.g. classification isn’t necessarily fully tied to “photorealism”. This is because I only need to encode the information that is needed to distinguish specific concepts.
In long: If you look at our paper you will see that autoregressive samples, from a “global perspective”, often look messier and initially less visually appealing than images from non-autoregressive models. However, autoregressive models place way more emphasis on local structure than just on global structure (as this is what they were built to do). This can intrinsically lead to more diversity/local variation. So tying it back to your GAN question, I think a very important aspect is whether the generative model is actually able to replay the distribution in terms of diversity, but also in terms of the structure relevant for the supervised task. Neither of these is easy to gauge by just looking at the human perceptual quality of a generated image from a GAN. Models like VAEs (or in fact some more advanced GANs) have the advantage here of having access to data likelihoods, which are more useful for this purpose.
Does that help?
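A toy way to see this point, offered as a sketch under assumptions rather than anything from the paper: instead of judging replay samples by eye, train a classifier only on generated samples and score it on real data. Below, one hypothetical generator produces very "clean" samples but has collapsed onto a single class, while another produces noisier samples that cover both classes. The 1-D data, the nearest-centroid classifier and all names are illustrative choices.

```python
import random

def fit_centroids(samples):
    """Nearest-centroid classifier: mean of the 1-D feature per class."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    # Pick the class whose centroid is closest to x.
    return min(centroids, key=lambda y: abs(centroids[y] - x))

def accuracy(centroids, samples):
    return sum(predict(centroids, x) == y for x, y in samples) / len(samples)

rng = random.Random(0)
CENTERS = {0: -2.0, 1: 2.0}  # toy "real" data: two classes on a line

real_test = [(rng.gauss(CENTERS[y], 0.5), y) for y in (0, 1) for _ in range(200)]

# Generator A (assumed): low-noise, "sharp" samples, but only of class 0.
collapsed = [(rng.gauss(CENTERS[0], 0.05), 0) for _ in range(200)]
# Generator B (assumed): noisier, "messier" samples that cover both classes.
diverse = [(rng.gauss(CENTERS[y], 1.0), y) for y in (0, 1) for _ in range(100)]

acc_collapsed = accuracy(fit_centroids(collapsed), real_test)
acc_diverse = accuracy(fit_centroids(diverse), real_test)
print(f"trained on collapsed replay: {acc_collapsed:.2f}")
print(f"trained on diverse replay:   {acc_diverse:.2f}")
```

The classifier trained on the collapsed generator can only ever predict one class, no matter how clean its samples are, which is the sense in which sample quality alone says little about usefulness for rehearsal.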
@Suri_Bhasker_Sri_Harsha: “In short: the information necessary to retain discriminative representations for e.g. classification isn’t necessarily fully tied to ‘photorealism’.” That is one beautiful conclusion. I came to the same conclusion. Good to see others sharing my opinion. Will surely take a look at your paper.
@vlomonaco: @Gabriele_Graffieti may be able to add something on this