Automating reading comprehension by generating question and answer pairs

Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, Yuan Fang Li

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Neural network-based methods represent the state of the art in question generation from text. Existing work focuses on generating only questions from text, without addressing answer generation. Moreover, our analysis shows that handling rare words and generating the most appropriate question for a given candidate answer remain challenges for existing approaches. We present a novel two-stage process to generate question-answer pairs from text. For the first stage, we present alternatives for encoding the span of the pivotal answer in the sentence using Pointer Networks. In the second stage, we employ sequence-to-sequence models for question generation, enhanced with rich linguistic features. Finally, global attention and answer encoding are used to generate the question most relevant to the answer. We motivate and linguistically analyze the role of each component in our framework and consider compositions of these components. This analysis is supported by extensive experimental evaluations. Using standard evaluation metrics as well as human evaluations, our experimental results show a significant improvement in the quality of questions generated by our framework over the state of the art. The technique presented here represents another step towards more automated reading comprehension assessment. We also present a live system (a demo is available at https://www.cse.iitb.ac.in/~vishwajeet/autoqg.html) to demonstrate the effectiveness of our approach.
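As a rough illustration of the first stage described in the abstract, the sketch below picks an answer span from per-token start and end scores, in the spirit of a pointer-style span selector, and then marks the chosen tokens with a binary answer feature that a question generator could consume. This is not the authors' implementation: the function names, the `max_len` limit, and the hand-written scores are all assumptions made for illustration, and in a real system the scores would come from a trained network.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_scores, end_scores, max_len=5):
    """Return (start, end, prob) maximizing p_start[i] * p_end[j]
    subject to i <= j < i + max_len (hypothetical length limit)."""
    p_start = softmax(start_scores)
    p_end = softmax(end_scores)
    best = (0, 0, -1.0)
    for i, ps in enumerate(p_start):
        for j in range(i, min(i + max_len, len(p_end))):
            prob = ps * p_end[j]
            if prob > best[2]:
                best = (i, j, prob)
    return best


def tag_answer(tokens, start, end):
    """Attach a binary answer-position feature to every token."""
    return [(tok, 1 if start <= k <= end else 0) for k, tok in enumerate(tokens)]


# Toy sentence with made-up scores (assumption: higher means more likely).
tokens = ["Melbourne", "hosted", "PAKDD", "in", "2018"]
start_scores = [0.1, 0.0, 0.2, 0.0, 2.0]
end_scores = [0.0, 0.1, 0.0, 0.2, 2.5]

start, end, prob = best_span(start_scores, end_scores)
tagged = tag_answer(tokens, start, end)
print(tokens[start:end + 1], tagged)
```

The exhaustive search over start and end positions is quadratic in the worst case, which is acceptable for single sentences; the tagged tokens stand in for the answer encoding that the second stage would combine with other linguistic features.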

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining
Subtitle of host publication: 22nd Pacific-Asia Conference, PAKDD 2018 Melbourne, VIC, Australia, June 3–6, 2018 Proceedings, Part III
Editors: Dinh Phung, Vincent S. Tseng, Geoffrey I. Webb, Bao Ho, Mohadeseh Ganji, Lida Rashidi
Place of Publication: Cham Switzerland
Publisher: Springer
Pages: 335-348
Number of pages: 14
ISBN (Electronic): 9783319930404
ISBN (Print): 9783319930398
DOIs: https://doi.org/10.1007/978-3-319-93040-4_27
Publication status: Published - 2018
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018 - Grand Hyatt, Melbourne, Australia
Duration: 3 Jun 2018 to 6 Jun 2018
Conference number: 22nd
http://pakdd2018.medmeeting.org/Content/92892

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 10939
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018
Abbreviated title: PAKDD 2018
Country: Australia
City: Melbourne
Period: 3/06/18 to 6/06/18

Keywords

  • Pointer network
  • Question generation
  • Sequence to sequence modeling

Cite this

Kumar, V., Boorla, K., Meena, Y., Ramakrishnan, G., & Li, Y. F. (2018). Automating reading comprehension by generating question and answer pairs. In D. Phung, V. S. Tseng, G. I. Webb, B. Ho, M. Ganji, & L. Rashidi (Eds.), Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018 Melbourne, VIC, Australia, June 3–6, 2018 Proceedings, Part III (pp. 335-348). (Lecture Notes in Computer Science; Vol. 10939). Cham Switzerland: Springer. https://doi.org/10.1007/978-3-319-93040-4_27

Kumar, Vishwajeet; Boorla, Kireeti; Meena, Yogesh; Ramakrishnan, Ganesh; Li, Yuan Fang. / Automating reading comprehension by generating question and answer pairs. Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018 Melbourne, VIC, Australia, June 3–6, 2018 Proceedings, Part III. editor / Dinh Phung; Vincent S. Tseng; Geoffrey I. Webb; Bao Ho; Mohadeseh Ganji; Lida Rashidi. Cham Switzerland: Springer, 2018. pp. 335-348 (Lecture Notes in Computer Science).

@inproceedings{0314ff439f7e4a23a4a50df89b8549f8,
title = "Automating reading comprehension by generating question and answer pairs",
keywords = "Pointer network, Question generation, Sequence to sequence modeling",
author = "Vishwajeet Kumar and Kireeti Boorla and Yogesh Meena and Ganesh Ramakrishnan and Li, {Yuan Fang}",
year = "2018",
doi = "10.1007/978-3-319-93040-4_27",
language = "English",
isbn = "9783319930398",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "335--348",
editor = "Phung, Dinh and Tseng, {Vincent S.} and Webb, {Geoffrey I.} and Ho, Bao and Ganji, Mohadeseh and Rashidi, Lida",
booktitle = "Advances in Knowledge Discovery and Data Mining",
volume = "10939",
}

Kumar, V, Boorla, K, Meena, Y, Ramakrishnan, G & Li, YF 2018, Automating reading comprehension by generating question and answer pairs. in D Phung, VS Tseng, GI Webb, B Ho, M Ganji & L Rashidi (eds), Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018 Melbourne, VIC, Australia, June 3–6, 2018 Proceedings, Part III. Lecture Notes in Computer Science, vol. 10939, Springer, Cham Switzerland, pp. 335-348, Pacific-Asia Conference on Knowledge Discovery and Data Mining 2018, Melbourne, Australia, 3/06/18. https://doi.org/10.1007/978-3-319-93040-4_27

Automating reading comprehension by generating question and answer pairs. / Kumar, Vishwajeet; Boorla, Kireeti; Meena, Yogesh; Ramakrishnan, Ganesh; Li, Yuan Fang.

Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018 Melbourne, VIC, Australia, June 3–6, 2018 Proceedings, Part III. ed. / Dinh Phung; Vincent S. Tseng; Geoffrey I. Webb; Bao Ho; Mohadeseh Ganji; Lida Rashidi. Cham Switzerland: Springer, 2018. p. 335-348 (Lecture Notes in Computer Science; Vol. 10939).


TY - GEN

T1 - Automating reading comprehension by generating question and answer pairs

AU - Kumar, Vishwajeet

AU - Boorla, Kireeti

AU - Meena, Yogesh

AU - Ramakrishnan, Ganesh

AU - Li, Yuan Fang

PY - 2018

Y1 - 2018

KW - Pointer network

KW - Question generation

KW - Sequence to sequence modeling

UR - http://www.scopus.com/inward/record.url?scp=85049367444&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-93040-4_27

DO - 10.1007/978-3-319-93040-4_27

M3 - Conference Paper

SN - 9783319930398

T3 - Lecture Notes in Computer Science

SP - 335

EP - 348

BT - Advances in Knowledge Discovery and Data Mining

A2 - Phung, Dinh

A2 - Tseng, Vincent S.

A2 - Webb, Geoffrey I.

A2 - Ho, Bao

A2 - Ganji, Mohadeseh

A2 - Rashidi, Lida

PB - Springer

CY - Cham Switzerland

ER -

Kumar V, Boorla K, Meena Y, Ramakrishnan G, Li YF. Automating reading comprehension by generating question and answer pairs. In Phung D, Tseng VS, Webb GI, Ho B, Ganji M, Rashidi L, editors, Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018 Melbourne, VIC, Australia, June 3–6, 2018 Proceedings, Part III. Cham Switzerland: Springer. 2018. p. 335-348. (Lecture Notes in Computer Science). https://doi.org/10.1007/978-3-319-93040-4_27