Stability–Plasticity Dilemma
Continual learning systems must balance two competing needs: adaptation and retention. Adaptation, or plasticity, is the system's ability to learn new information or skills rapidly. Retention, or stability, is the system's capacity to preserve previously acquired knowledge over time. In cognitive science, this mirrors how your brain can both learn new facts and remember old ones. In neural networks, plasticity allows a model to adjust its weights to accommodate new data, while stability ensures that these updates do not erase what was learned before. Striking this balance is essential for any system that must learn continuously from a stream of experiences.
Fixed-capacity models face a fundamental challenge when confronted with the stability–plasticity dilemma. Because these models have a limited number of parameters, they must share those resources across all tasks and data encountered over time. This parameter sharing means that updating the model to learn new tasks can overwrite or interfere with the representations needed for earlier tasks. Unlike a human brain, which can recruit new neurons or reorganize its structure, most neural networks have a fixed architecture: every new learning episode risks degrading performance on what was previously known. This is the crux of the dilemma for fixed-capacity models—there is no free lunch when it comes to dividing a finite set of weights between old and new knowledge.
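To make the interference concrete, here is a minimal NumPy sketch of sequential training in a fixed-capacity model. The two toy regression tasks, the learning rate, and the step counts are illustrative assumptions, not taken from the source; the point is only that reusing one weight vector for a second task degrades performance on the first.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy regression tasks that want different weights from the same fixed model
X_a = rng.normal(size=(100, 5))
X_b = rng.normal(size=(100, 5))
y_a = X_a @ np.array([1.0, 0.0, 2.0, 0.0, -1.0])   # task A targets
y_b = X_b @ np.array([-2.0, 1.0, 0.0, 3.0, 0.0])   # task B targets

def mse(w, X, y):
    """Mean squared error of the linear model X @ w against targets y."""
    return float(np.mean((X @ w - y) ** 2))

def train(w, X, y, lr=0.05, steps=200):
    """Plain gradient descent on the MSE of a single task."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w = np.zeros(5)
w = train(w, X_a, y_a)                                   # plasticity: task A is learned
print("task-A loss after task A:", mse(w, X_a, y_a))     # close to 0
w = train(w, X_b, y_b)                                   # same weights reused for task B
print("task-A loss after task B:", mse(w, X_a, y_a))     # large again: interference
```

Because the five weights are shared, fitting task B necessarily moves them away from the values that fit task A, which is exactly the forgetting described above.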
The stability–plasticity dilemma can be formalized mathematically as a tension in the optimization objective of continual learning. Let $L_{\text{old}}(\theta)$ represent the loss on previous tasks and $L_{\text{new}}(\theta)$ the loss on the current task, both as functions of the model parameters $\theta$. When you minimize $L_{\text{new}}(\theta)$ to learn new information, you may inadvertently increase $L_{\text{old}}(\theta)$, leading to forgetting. The overall optimization problem becomes a balancing act:

$$\min_{\theta}\;\; \alpha\, L_{\text{old}}(\theta) + \beta\, L_{\text{new}}(\theta)$$

where $\alpha$ and $\beta$ are weights reflecting the importance of retaining old knowledge versus adapting to new information. Because changes to $\theta$ that decrease $L_{\text{new}}$ can increase $L_{\text{old}}$, perfect retention and perfect adaptation are generally incompatible. This formalization captures the core trade-off at the heart of continual learning.
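As a rough illustration of this weighted objective, the sketch below takes gradient steps on $\alpha\, L_{\text{old}}(\theta) + \beta\, L_{\text{new}}(\theta)$ for two toy regression tasks sharing one parameter vector. The tasks, the closed-form gradients, and the values of $\alpha$, $\beta$, and the learning rate are all assumptions made for illustration; in a real continual-learning setting $L_{\text{old}}$ is usually not directly available and must be approximated (for example via replay or regularization).

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared parameter vector theta, an "old" task and a "new" task (toy data)
X_old = rng.normal(size=(100, 5))
X_new = rng.normal(size=(100, 5))
y_old = X_old @ np.array([1.0, 0.0, 2.0, 0.0, -1.0])
y_new = X_new @ np.array([-2.0, 1.0, 0.0, 3.0, 0.0])

def mse_and_grad(theta, X, y):
    """MSE loss of a linear model and its gradient with respect to theta."""
    err = X @ theta - y
    return float(np.mean(err ** 2)), 2 * X.T @ err / len(y)

def combined_step(theta, alpha, beta, lr=0.05):
    """One gradient step on alpha * L_old(theta) + beta * L_new(theta)."""
    L_old, g_old = mse_and_grad(theta, X_old, y_old)
    L_new, g_new = mse_and_grad(theta, X_new, y_new)
    return theta - lr * (alpha * g_old + beta * g_new), L_old, L_new

theta = np.zeros(5)
for _ in range(300):
    theta, L_old, L_new = combined_step(theta, alpha=1.0, beta=1.0)

# Neither loss reaches zero: theta settles on a compromise between the two tasks
print(f"L_old = {L_old:.3f}, L_new = {L_new:.3f}")
```

Raising $\alpha$ pushes the solution toward the old task (more stability), while raising $\beta$ favors the new task (more plasticity); no setting of the two drives both losses to zero here, which is the trade-off the formula expresses.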
Several key assumptions underlie the stability–plasticity dilemma. First, model capacity is finite: you only have a limited number of parameters to allocate across all tasks. Second, the data distribution is non-stationary, meaning new tasks or experiences may differ significantly from those seen previously. Third, it is fundamentally impossible to achieve both perfect retention of all past knowledge and perfect adaptation to new data — some compromise is always necessary. These assumptions define the boundaries within which continual learning methods must operate.
Key takeaways:
- The stability–plasticity dilemma is a central challenge in continual learning;
- It is driven by the need to balance adaptation and retention;
- The dilemma is rooted in the optimization dynamics of neural networks and the hard limits of finite capacity;
- All proposed solutions must navigate these trade-offs, accepting that some forgetting or imperfect learning is inevitable whenever capacity and data are limited.