Mixture-of-Partitions: infusing large biomedical knowledge graphs into BERT

Zaiqiao Meng, Fangyu Liu, Thomas Hikaru Clark, Ehsan Shareghi, Nigel Collier

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Infusing factual knowledge into pre-trained models is fundamental for many knowledge-intensive tasks. In this paper, we propose Mixture-of-Partitions (MoP), an infusion approach that can handle a very large knowledge graph (KG) by partitioning it into smaller sub-graphs and infusing their specific knowledge into various BERT models using lightweight adapters. To leverage the overall factual knowledge for a target task, these sub-graph adapters are further fine-tuned along with the underlying BERT through a mixture layer. We evaluate MoP with three biomedical BERTs (SciBERT, BioBERT, PubMedBERT) on six downstream tasks (incl. NLI, QA, and classification), and the results show that MoP consistently improves the task performance of the underlying BERTs and achieves new SOTA results on five of the evaluated datasets.
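The mixture layer described in the abstract combines the outputs of the per-partition adapters with learned gate scores. The following is a minimal pure-Python sketch of that idea under stated assumptions: the function names (`mixture_of_adapters`), the representation of adapters as plain callables, and the externally supplied gate scores are illustrative, not the paper's actual implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_of_adapters(hidden, adapters, gate_scores):
    """Combine K sub-graph adapter outputs into one vector.

    hidden      : the BERT hidden state (a list of floats, for illustration)
    adapters    : one callable per KG partition, mapping hidden -> hidden'
    gate_scores : unnormalised scores from a (hypothetical) gating network
    """
    probs = softmax(gate_scores)                      # mixture weights, sum to 1
    outputs = [adapter(hidden) for adapter in adapters]
    dim = len(hidden)
    # Weighted sum of the adapter outputs, dimension by dimension.
    return [sum(p * out[i] for p, out in zip(probs, outputs))
            for i in range(dim)]
```

With equal gate scores this reduces to a plain average of the adapter outputs; fine-tuning the gate alongside the underlying BERT lets the model weight the sub-graph most relevant to the target task.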
Original language: English
Title of host publication: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Editors: Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Place of Publication: Stroudsburg PA USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 4672–4681
Number of pages: 10
ISBN (Electronic): 9781955917094
Publication status: Published - 2021
Event: Empirical Methods in Natural Language Processing 2021 - Online, Dominican Republic
Duration: 7 Nov 2021 – 11 Nov 2021
https://2021.emnlp.org/ (Website)
https://aclanthology.org/2021.emnlp-main.0/ (Proceedings)

Conference

Conference: Empirical Methods in Natural Language Processing 2021
Abbreviated title: EMNLP 2021
Country/Territory: Dominican Republic
Period: 7/11/21 – 11/11/21
