SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression

Jinming Zhao, Ming Liu, Longxiang Gao, Yuan Jin, Lan Du, He Zhao, He Zhang, Gholamreza Haffari

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

3 Citations (Scopus)

Abstract

Obtaining training data for multi-document summarization (MDS) is time-consuming and resource-intensive, so recent neural models can only be trained for limited domains. In this paper, we propose SummPip, an unsupervised method for multi-document summarization that works in three steps: it converts the original documents to a sentence graph, taking both linguistic and deep representations into account; applies spectral clustering to obtain multiple clusters of sentences; and compresses each cluster to generate the final summary. Experiments on the Multi-News and DUC-2004 datasets show that our method is competitive with previous unsupervised methods and is even comparable to neural supervised approaches. In addition, human evaluation shows that our system produces consistent and complete summaries compared to human-written ones.
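The pipeline described in the abstract (sentence graph, spectral clustering, one output per cluster) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation, and every name in it (`sentence_graph`, `spectral_bipartition`, `summarize`, the `smoothing` parameter, the tiny stopword list) is hypothetical. It makes three simplifying assumptions: edge weights are plain bag-of-words cosine similarities rather than the linguistic and deep representations the abstract mentions; clustering is the two-cluster special case of spectral clustering, computed by power iteration; and the compression step is replaced by simply picking the most central sentence of each cluster.

```python
import math
import random
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "in", "of", "and", "after"}

def bag_of_words(sentence):
    return Counter(w for w in sentence.lower().split() if w not in STOPWORDS)

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def sentence_graph(sentences, smoothing=0.01):
    """Symmetric weight matrix; smoothing keeps the graph connected."""
    bows = [bag_of_words(s) for s in sentences]
    n = len(sentences)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            W[i][j] = W[j][i] = cosine(bows[i], bows[j]) + smoothing
    return W

def spectral_bipartition(W, iters=200, seed=0):
    """Split nodes by the sign of the second eigenvector of D^-1/2 W D^-1/2."""
    n = len(W)
    if n < 2:
        return [0] * n
    deg = [sum(row) for row in W]
    # Shift by the identity so all eigenvalues are non-negative.
    M = [[W[i][j] / math.sqrt(deg[i] * deg[j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # The top eigenvector is known in closed form: sqrt(degree), normalized.
    top = [math.sqrt(d) for d in deg]
    top_norm = math.sqrt(sum(t * t for t in top))
    top = [t / top_norm for t in top]
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(iters):
        # Deflate the top eigenvector, then apply one power-iteration step.
        proj = sum(xi * ti for xi, ti in zip(x, top))
        x = [xi - proj * ti for xi, ti in zip(x, top)]
        x = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm == 0:
            break
        x = [xi / norm for xi in x]
    return [0 if xi >= 0 else 1 for xi in x]

def summarize(sentences):
    """Return (cluster labels, one representative sentence per cluster)."""
    W = sentence_graph(sentences)
    labels = spectral_bipartition(W)
    summary = []
    for c in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        # Stand-in for compression: keep the most central sentence.
        best = max(members, key=lambda i: sum(W[i][j] for j in members))
        summary.append(sentences[best])
    return labels, summary

docs = [
    "the storm flooded the coastal town",
    "heavy storm rain flooded streets in the town",
    "the team won the championship game",
    "fans celebrated the team after the championship",
]
labels, summary = summarize(docs)
```

On this toy input the two weather sentences and the two sports sentences should end up in separate clusters, with one sentence kept from each. A fuller version would use more than two clusters and would fuse the sentences in a cluster into a single compressed sentence instead of selecting one.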

Original language: English
Title of host publication: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Jaap Kamps, Vanessa Murdock, Ji-Rong Wen
Place of Publication: New York NY USA
Publisher: Association for Computing Machinery (ACM)
Pages: 1949-1952
Number of pages: 4
ISBN (Electronic): 9781450380164
Publication status: Published - 2020
Event: ACM International Conference on Research and Development in Information Retrieval 2020 - Virtual, Online, China
Duration: 25 Jul 2020 to 30 Jul 2020
Conference number: 43rd
https://dl.acm.org/doi/proceedings/10.1145/3397271 (Proceedings)
https://sigir.org/sigir2020/ (Website)

Conference

Conference: ACM International Conference on Research and Development in Information Retrieval 2020
Abbreviated title: SIGIR 2020
Country/Territory: China
City: Virtual, Online
Period: 25/07/20 to 30/07/20

Keywords

  • cluster
  • sentence graph
  • summarization
  • text compression
