Abstract
Continual learning (CL) is a setting in which a model learns from a stream of incoming data while avoiding forgetting previously learned knowledge. Pre-trained language models (PLMs) have been successfully employed in continual learning for different natural language problems. With the rapid development of continual learning methods and PLMs, understanding and disentangling their interactions becomes essential for continued improvement of continual learning performance. In this paper, we thoroughly compare continual learning performance across combinations of 5 PLMs and 4 CL approaches on 3 benchmarks in 2 typical incremental settings. Our extensive experimental analyses reveal interesting performance differences across PLMs and across CL methods. Furthermore, our representativeness probing analyses dissect PLMs' performance characteristics in a layer-wise and task-wise manner, uncovering the extent to which their inner layers suffer from forgetting and the effect of different CL approaches on each layer. Finally, our observations and analyses open up a number of important research questions that will inform and guide the design of effective continual learning techniques.
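To illustrate the kind of layer-wise probing the abstract refers to, the sketch below trains a simple linear probe on frozen per-layer representations of a PLM and reports per-layer probing accuracy. This is a minimal illustration, not the paper's actual implementation; the model name, the use of the [CLS] token, and the `layerwise_probe` helper are assumptions for the example.

```python
# Minimal layer-wise probing sketch (illustrative only, not the paper's code):
# freeze a PLM, extract each layer's [CLS] representation, and fit a linear
# probe per layer to estimate how much task-relevant information that layer holds.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def layerwise_probe(texts, labels, model_name="bert-base-uncased"):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    model.eval()

    with torch.no_grad():
        enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        # hidden_states: tuple of (num_layers + 1) tensors, each (batch, seq_len, hidden)
        hidden_states = model(**enc).hidden_states

    scores = []
    for layer_repr in hidden_states:
        feats = layer_repr[:, 0, :].numpy()   # [CLS] representation per example
        probe = LogisticRegression(max_iter=1000)
        probe.fit(feats, labels)              # in practice, use a train/test split per task
        scores.append(accuracy_score(labels, probe.predict(feats)))
    return scores                             # one probing accuracy per layer
```

Comparing these per-layer scores before and after continual training on new tasks gives a rough, layer-wise picture of where forgetting occurs.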
Original language | English |
---|---|
Title of host publication | International Conference on Learning Representations 2022 |
Editors | Yann LeCun |
Place of Publication | USA |
Publisher | OpenReview |
Number of pages | 17 |
Publication status | Published - 2022 |
Event | International Conference on Learning Representations 2022 - Online, United States of America, 25 Apr 2022 → 29 Apr 2022 (10th conference) |
Conference
Conference | International Conference on Learning Representations 2022 |
---|---|
Abbreviated title | ICLR 2022 |
Country/Territory | United States of America |
Period | 25/04/22 → 29/04/22 |
Internet address | https://openreview.net/group?id=ICLR.cc/2022/Conference (Peer Reviews); https://iclr.cc/Conferences/2022 (Website) |
Keywords
- Pre-trained Language Model
- Continual Learning