Pretrained Language Model in Continual Learning: A Comparative Study

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

26 Citations (Scopus)

Abstract

Continual learning (CL) is a setting in which a model learns from a stream of incoming data without forgetting previously learned knowledge. Pre-trained language models (PLMs) have been successfully employed in continual learning for a range of natural language problems. With the rapid development of both continual learning methods and PLMs, understanding and disentangling their interactions becomes essential for continued improvement of continual learning performance. In this paper, we thoroughly compare continual learning performance over combinations of 5 PLMs and 4 CL approaches on 3 benchmarks in 2 typical incremental settings. Our extensive experimental analyses reveal interesting performance differences across PLMs and across CL methods. Furthermore, our representativeness probing analyses dissect PLMs' performance characteristics in a layer-wise and task-wise manner, uncovering the extent to which their inner layers suffer from forgetting, and the effect of different CL approaches on each layer. Finally, our observations and analyses open up a number of important research questions that will inform and guide the design of effective continual learning techniques.
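
The layer-wise probing mentioned above can be pictured as training a simple linear probe on the frozen representations of each PLM layer and tracking probe accuracy across tasks. Below is a minimal illustrative sketch of that idea in Python; the model name, toy data, and logistic-regression probe are assumptions for illustration only, not the authors' actual probing setup.

    # Illustrative layer-wise probing sketch (assumed setup, not the paper's exact method).
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression

    MODEL_NAME = "bert-base-uncased"  # assumed PLM; the paper compares several PLMs

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
    model.eval()

    def layer_representations(texts):
        """Return [CLS] vectors from every encoder layer for a list of texts."""
        enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc)
        # hidden_states: tuple of (num_layers + 1) tensors of shape [batch, seq, hidden]
        return [h[:, 0, :].numpy() for h in out.hidden_states]

    # Toy labelled data standing in for one task of a CL benchmark (hypothetical).
    train_texts, train_labels = ["great movie", "terrible plot", "loved it", "boring and slow"], [1, 0, 1, 0]
    test_texts, test_labels = ["wonderful acting", "awful pacing"], [1, 0]

    train_layers = layer_representations(train_texts)
    test_layers = layer_representations(test_texts)

    # Fit one linear probe per layer; a drop in probe accuracy after continual
    # training on later tasks would indicate forgetting inside that layer.
    for idx, (tr, te) in enumerate(zip(train_layers, test_layers)):
        probe = LogisticRegression(max_iter=1000).fit(tr, train_labels)
        print(f"layer {idx}: probe accuracy = {probe.score(te, test_labels):.2f}")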

Original language: English
Title of host publication: International Conference on Learning Representations 2022
Editors: Yann LeCun
Place of Publication: USA
Publisher: OpenReview
Number of pages: 17
Publication status: Published - 2022
Event: International Conference on Learning Representations 2022 - Online, United States of America
Duration: 25 Apr 2022 - 29 Apr 2022
Conference number: 10th
https://openreview.net/group?id=ICLR.cc/2022/Conference (Peer Reviews)
https://iclr.cc/Conferences/2022 (Website)

Conference

Conference: International Conference on Learning Representations 2022
Abbreviated title: ICLR 2022
Country/Territory: United States of America
Period: 25/04/22 - 29/04/22

Keywords

  • Pre-trained Language Model
  • Continual Learning
