Multi-document summarization based on sentence cluster using Non-negative Matrix Factorization

Libin Yang, Xiaoyan Cai, Shirui Pan, Hang Dai, Dejun Mu

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Multi-document summarization aims to produce a concise summary that contains salient information from a set of source documents. Many approaches use statistics and machine learning techniques to extract sentences from documents. In this paper, we propose a new multi-document summarization framework based on sentence clustering using Nonnegative Matrix Tri-Factorization (NMTF). The proposed framework employs NMTF to cluster sentences using inter-type relationships among documents, sentences and terms, and incorporates the intra-type information through manifold regularization. The most informative sentences are selected from each sentence cluster to form the summary. When evaluated on the DUC2004 and TAC2008 datasets, the performance of the proposed framework is comparable with that of the top three systems.
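
As a rough illustration of the factorization step named in the abstract, the sketch below applies plain nonnegative matrix tri-factorization, X ≈ F S Gᵀ, to a toy sentence-term matrix and reads sentence clusters off the row factor F. It is not the authors' framework: it leaves out the document-sentence relationships and the manifold regularization, and it uses the standard multiplicative updates for the unregularized Frobenius objective. The function name `nmtf`, the toy matrix, and the rule for picking one representative sentence per cluster are all assumptions made for illustration.

```python
import numpy as np


def nmtf(X, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Plain NMTF sketch: approximate X (n x m, nonnegative) by F @ S @ G.T.

    F (n x k_rows) holds row (sentence) cluster memberships,
    G (m x k_cols) holds column (term) cluster memberships,
    S (k_rows x k_cols) links the two sets of clusters.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((m, k_cols)) + eps
    for _ in range(n_iter):
        # Multiplicative updates for ||X - F S G^T||_F^2; eps avoids division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G


# Hypothetical toy data: 4 sentences (rows) x 5 terms (columns), raw term counts.
X = np.array(
    [
        [2.0, 1.0, 0.0, 0.0, 0.0],
        [1.0, 2.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 2.0, 1.0],
        [0.0, 0.0, 2.0, 1.0, 1.0],
    ]
)

F, S, G = nmtf(X, k_rows=2, k_cols=2)

# Assign each sentence to the cluster with the largest membership weight.
labels = F.argmax(axis=1)

# Assumed selection heuristic (not from the paper): within each non-empty
# cluster, pick the member sentence with the highest membership weight.
representatives = {
    int(c): int(np.argmax(np.where(labels == c, F[:, c], -1.0)))
    for c in np.unique(labels)
}

print("cluster labels:", labels)
print("representative sentence per cluster:", representatives)
print("reconstruction error:", np.linalg.norm(X - F @ S @ G.T))
```

In this sketch the cluster counts `k_rows` and `k_cols` are fixed by hand; how the paper chooses them, ranks sentences within a cluster, or weights the regularization term is not stated in the abstract.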

Original language: English
Pages (from-to): 1867-1879
Number of pages: 13
Journal: Journal of Intelligent and Fuzzy Systems
Volume: 33
Issue number: 3
DOI: 10.3233/JIFS-161613
Publication status: Published - 2017
Externally published: Yes

Keywords

  • cluster-based ranking
  • manifold ranking
  • Multi-document summarization
  • non-negative matrix tri-factorization
  • sentence clustering

Cite this

Yang, Libin ; Cai, Xiaoyan ; Pan, Shirui ; Dai, Hang ; Mu, Dejun. / Multi-document summarization based on sentence cluster using Non-negative Matrix Factorization. In: Journal of Intelligent and Fuzzy Systems. 2017 ; Vol. 33, No. 3. pp. 1867-1879.
@article{26a13831a09b478cae76484e2a9532b7,
title = "Multi-document summarization based on sentence cluster using Non-negative Matrix Factorization",
abstract = "Multi-document summarization aims to produce a concise summary that contains salient information from a set of source documents. Many approaches use statistics and machine learning techniques to extract sentences from documents. In this paper, we propose a new multi-document summarization framework based on sentence cluster using Nonnegative Matrix Tri-Factorization (NMTF). The proposed framework employs NMTF to cluster sentences using inter-type relationships among documents, sentences and terms, and incorporate the intra-type information through manifold regularization. The most informative sentences are selected from each sentence cluster to form the summary. When evaluated on the DUC2004 and TAC2008 datasets, the performance of the proposed framework is comparable with that of the top three systems.",
keywords = "cluster-based ranking, manifold ranking, Multi-document summarization, non-negative matrix tri-factorization, sentence clustering",
author = "Libin Yang and Xiaoyan Cai and Shirui Pan and Hang Dai and Dejun Mu",
year = "2017",
doi = "10.3233/JIFS-161613",
language = "English",
volume = "33",
pages = "1867--1879",
journal = "Journal of Intelligent and Fuzzy Systems",
issn = "1064-1246",
publisher = "IOS Press",
number = "3",

}

TY  - JOUR
T1  - Multi-document summarization based on sentence cluster using Non-negative Matrix Factorization
AU  - Yang, Libin
AU  - Cai, Xiaoyan
AU  - Pan, Shirui
AU  - Dai, Hang
AU  - Mu, Dejun
PY  - 2017
Y1  - 2017
N2  - Multi-document summarization aims to produce a concise summary that contains salient information from a set of source documents. Many approaches use statistics and machine learning techniques to extract sentences from documents. In this paper, we propose a new multi-document summarization framework based on sentence cluster using Nonnegative Matrix Tri-Factorization (NMTF). The proposed framework employs NMTF to cluster sentences using inter-type relationships among documents, sentences and terms, and incorporate the intra-type information through manifold regularization. The most informative sentences are selected from each sentence cluster to form the summary. When evaluated on the DUC2004 and TAC2008 datasets, the performance of the proposed framework is comparable with that of the top three systems.
AB  - Multi-document summarization aims to produce a concise summary that contains salient information from a set of source documents. Many approaches use statistics and machine learning techniques to extract sentences from documents. In this paper, we propose a new multi-document summarization framework based on sentence cluster using Nonnegative Matrix Tri-Factorization (NMTF). The proposed framework employs NMTF to cluster sentences using inter-type relationships among documents, sentences and terms, and incorporate the intra-type information through manifold regularization. The most informative sentences are selected from each sentence cluster to form the summary. When evaluated on the DUC2004 and TAC2008 datasets, the performance of the proposed framework is comparable with that of the top three systems.
KW  - cluster-based ranking
KW  - manifold ranking
KW  - Multi-document summarization
KW  - non-negative matrix tri-factorization
KW  - sentence clustering
UR  - http://www.scopus.com/inward/record.url?scp=85028543935&partnerID=8YFLogxK
U2  - 10.3233/JIFS-161613
DO  - 10.3233/JIFS-161613
M3  - Article
VL  - 33
SP  - 1867
EP  - 1879
JO  - Journal of Intelligent and Fuzzy Systems
JF  - Journal of Intelligent and Fuzzy Systems
SN  - 1064-1246
IS  - 3
ER  -