A neural model for method name generation from functional description

Sa Gao, Chunyang Chen, Zhenchang Xing, Yukun Ma, Wen Song, Shang-Wei Lin

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research

1 Citation (Scopus)

Abstract

The names of software artifacts, e.g., method names, are important for software understanding and maintenance, as good names help developers easily understand others' code. However, developers, especially novices, find it difficult to follow existing naming guidelines and come up with meaningful, concise and compact names for variables, methods, classes and files. With the popularity of open source, an enormous amount of project source code is accessible, and the tedium and inconsistency of manually naming methods can now be relieved by automatically learning a naming model from a large code repository. Nevertheless, building a comprehensive naming system is still challenging because of the gap between natural language functional descriptions and method names. Specifically, there are three challenges: how to model the relationship between functional descriptions and formal method names, how to handle the explosion of vocabulary when dealing with large repositories, and how to transfer the knowledge learned from large repositories to a specific project. To address these challenges, we propose a neural network that directly generates readable method names from natural language descriptions. The proposed method is built upon the encoder-decoder framework with attention and copying mechanisms. Our experiments show that our method can generate meaningful and accurate method names and achieve significant improvement over state-of-the-art baseline models. We also address the cold-start problem using a training trick that utilizes big data from GitHub for specific projects.
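
To make the "attention and copying mechanisms" in the abstract concrete, the following is a minimal sketch of one decoding step. It is not the authors' implementation: it assumes dot-product attention and a pointer-generator-style mixture of a vocabulary distribution and a copy distribution, which may differ from the formulation in the paper. All names (`attention_weights`, `mix_generate_and_copy`, `p_gen`, `to_camel_case`) and all numbers are hypothetical, chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(decoder_state, encoder_states):
    """Dot-product attention (an assumption): one weight per description token."""
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    return softmax(scores)


def mix_generate_and_copy(vocab_probs, attn, source_tokens, p_gen):
    """Blend generating from a fixed vocabulary with copying from the input.

    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on positions
    of w in the description). Tokens outside the vocabulary can still
    receive probability through the copy term, which is how copying eases
    the vocabulary explosion.
    """
    final = {w: p_gen * p for w, p in vocab_probs.items()}
    for token, weight in zip(source_tokens, attn):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


def to_camel_case(subtokens):
    """Join predicted sub-tokens into a camelCase method name."""
    head, *tail = subtokens
    return head.lower() + "".join(t.capitalize() for t in tail)


# Toy inputs: a four-token description and made-up encoder/decoder vectors.
source_tokens = ["parse", "the", "yaml", "config"]
encoder_states = [[0.1, 0.0], [0.0, 0.2], [2.0, 0.0], [0.5, 0.1]]
decoder_state = [1.0, 0.0]
vocab_probs = {"get": 0.5, "parse": 0.3, "set": 0.2}  # "yaml" is out of vocabulary

attn = attention_weights(decoder_state, encoder_states)
final = mix_generate_and_copy(vocab_probs, attn, source_tokens, p_gen=0.4)
best_token = max(final, key=final.get)
name = to_camel_case(["parse", "yaml", "config"])
```

In this toy step the description token "yaml" is absent from the output vocabulary yet still receives probability through the copy term, which illustrates why a copying mechanism helps with rare identifiers in large repositories.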

Original language: English
Title of host publication: SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering
Subtitle of host publication: February 24-27, 2019 Hangzhou, China
Editors: Xinyu Wang, David Lo, Emad Shihab
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 411-421
Number of pages: 11
ISBN (Electronic): 9781728105918
ISBN (Print): 9781728105925
DOIs: https://doi.org/10.1109/SANER.2019.8667994
Publication status: Published - 2019
Event: IEEE International Conference on Software Analysis, Evolution, and Reengineering 2019 - Hangzhou, China
Duration: 24 Feb 2019 to 27 Feb 2019
Conference number: 26th
https://saner2019.github.io/

Conference

Conference: IEEE International Conference on Software Analysis, Evolution, and Reengineering 2019
Abbreviated title: SANER 2019
Country: China
City: Hangzhou
Period: 24/02/19 to 27/02/19
Internet address: https://saner2019.github.io/

Keywords

  • Encoder-Decoder Model
  • Naming Convention
  • Transfer Learning

Cite this

Gao, S., Chen, C., Xing, Z., Ma, Y., Song, W., & Lin, S-W. (2019). A neural model for method name generation from functional description. In X. Wang, D. Lo, & E. Shihab (Eds.), SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering: February 24-27, 2019 Hangzhou, China (pp. 411-421). [8667994] Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/SANER.2019.8667994
Gao, Sa ; Chen, Chunyang ; Xing, Zhenchang ; Ma, Yukun ; Song, Wen ; Lin, Shang-Wei. / A neural model for method name generation from functional description. SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering: February 24-27, 2019 Hangzhou, China. editor / Xinyu Wang ; David Lo ; Emad Shihab. Piscataway NJ USA : IEEE, Institute of Electrical and Electronics Engineers, 2019. pp. 411-421
@inproceedings{e6032a12a3a943d79ced57613dd88b92,
title = "A neural model for method name generation from functional description",
abstract = "The names of software artifacts, e.g., method names, are important for software understanding and maintenance, as good names help developers easily understand others' code. However, developers, especially novices, find it difficult to follow existing naming guidelines and come up with meaningful, concise and compact names for variables, methods, classes and files. With the popularity of open source, an enormous amount of project source code is accessible, and the tedium and inconsistency of manually naming methods can now be relieved by automatically learning a naming model from a large code repository. Nevertheless, building a comprehensive naming system is still challenging because of the gap between natural language functional descriptions and method names. Specifically, there are three challenges: how to model the relationship between functional descriptions and formal method names, how to handle the explosion of vocabulary when dealing with large repositories, and how to transfer the knowledge learned from large repositories to a specific project. To address these challenges, we propose a neural network that directly generates readable method names from natural language descriptions. The proposed method is built upon the encoder-decoder framework with attention and copying mechanisms. Our experiments show that our method can generate meaningful and accurate method names and achieve significant improvement over state-of-the-art baseline models. We also address the cold-start problem using a training trick that utilizes big data from GitHub for specific projects.",
keywords = "Encoder-Decoder Model, Naming Convention, Transfer Learning",
author = "Sa Gao and Chunyang Chen and Zhenchang Xing and Yukun Ma and Wen Song and Shang-Wei Lin",
year = "2019",
doi = "10.1109/SANER.2019.8667994",
language = "English",
isbn = "9781728105925",
pages = "411--421",
editor = "Xinyu Wang and David Lo and Emad Shihab",
booktitle = "SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "United States of America",

}

Gao, S, Chen, C, Xing, Z, Ma, Y, Song, W & Lin, S-W 2019, A neural model for method name generation from functional description. in X Wang, D Lo & E Shihab (eds), SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering: February 24-27, 2019 Hangzhou, China., 8667994, IEEE, Institute of Electrical and Electronics Engineers, Piscataway NJ USA, pp. 411-421, IEEE International Conference on Software Analysis, Evolution, and Reengineering 2019, Hangzhou, China, 24/02/19. https://doi.org/10.1109/SANER.2019.8667994

A neural model for method name generation from functional description. / Gao, Sa; Chen, Chunyang; Xing, Zhenchang; Ma, Yukun; Song, Wen; Lin, Shang-Wei.

SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering: February 24-27, 2019 Hangzhou, China. ed. / Xinyu Wang; David Lo; Emad Shihab. Piscataway NJ USA : IEEE, Institute of Electrical and Electronics Engineers, 2019. p. 411-421 8667994.

TY - GEN

T1 - A neural model for method name generation from functional description

AU - Gao, Sa

AU - Chen, Chunyang

AU - Xing, Zhenchang

AU - Ma, Yukun

AU - Song, Wen

AU - Lin, Shang-Wei

PY - 2019

Y1 - 2019

AB - The names of software artifacts, e.g., method names, are important for software understanding and maintenance, as good names help developers easily understand others' code. However, developers, especially novices, find it difficult to follow existing naming guidelines and come up with meaningful, concise and compact names for variables, methods, classes and files. With the popularity of open source, an enormous amount of project source code is accessible, and the tedium and inconsistency of manually naming methods can now be relieved by automatically learning a naming model from a large code repository. Nevertheless, building a comprehensive naming system is still challenging because of the gap between natural language functional descriptions and method names. Specifically, there are three challenges: how to model the relationship between functional descriptions and formal method names, how to handle the explosion of vocabulary when dealing with large repositories, and how to transfer the knowledge learned from large repositories to a specific project. To address these challenges, we propose a neural network that directly generates readable method names from natural language descriptions. The proposed method is built upon the encoder-decoder framework with attention and copying mechanisms. Our experiments show that our method can generate meaningful and accurate method names and achieve significant improvement over state-of-the-art baseline models. We also address the cold-start problem using a training trick that utilizes big data from GitHub for specific projects.

KW - Encoder-Decoder Model

KW - Naming Convention

KW - Transfer Learning

UR - http://www.scopus.com/inward/record.url?scp=85064159567&partnerID=8YFLogxK

U2 - 10.1109/SANER.2019.8667994

DO - 10.1109/SANER.2019.8667994

M3 - Conference Paper

SN - 9781728105925

SP - 411

EP - 421

BT - SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering

A2 - Wang, Xinyu

A2 - Lo, David

A2 - Shihab, Emad

PB - IEEE, Institute of Electrical and Electronics Engineers

CY - Piscataway NJ USA

ER -

Gao S, Chen C, Xing Z, Ma Y, Song W, Lin S-W. A neural model for method name generation from functional description. In Wang X, Lo D, Shihab E, editors, SANER ’19 - Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution, and Reengineering: February 24-27, 2019 Hangzhou, China. Piscataway NJ USA: IEEE, Institute of Electrical and Electronics Engineers. 2019. p. 411-421. 8667994 https://doi.org/10.1109/SANER.2019.8667994