Commit message generation for source code changes

Shengbin Xu, Yuan Yao, Feng Xu, Tianxiao Gu, Hanghang Tong, Jian Lu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Commit messages, which summarize source code changes in natural language, are essential for program comprehension and for understanding software evolution. Unfortunately, because developers lack a direct incentive to write them, commit messages are sometimes neglected, which makes automatic generation necessary. State-of-the-art methods adopt learning-based approaches, such as neural machine translation models, for the commit message generation problem. However, they tend to ignore code structure information and suffer from the out-of-vocabulary issue. In this paper, we propose CODISUM to address these two limitations. In particular, we first extract both code structure and code semantics from the source code changes, and then jointly model these two sources of information to better learn representations of the code changes. Moreover, we augment the model with a copying mechanism to further mitigate the out-of-vocabulary issue. Experimental evaluations on real data demonstrate that the proposed approach significantly outperforms the state of the art in generating accurate commit messages.
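The copying idea mentioned in the abstract can be illustrated with a small sketch. This is not CODISUM's actual formulation, which the abstract does not spell out; it is a generic pointer-generator-style mixture, assumed here purely for illustration. The function name `mix_copy_and_generate`, the gate `p_gen`, the toy vocabulary, and the attention weights are all hypothetical.

```python
from typing import Dict, List


def mix_copy_and_generate(
    vocab_probs: Dict[str, float],
    source_tokens: List[str],
    attention: List[float],
    p_gen: float,
) -> Dict[str, float]:
    """Blend a fixed-vocabulary distribution with a copy distribution.

    Assumption (not from the paper): the final probability of a word is
    p_gen * P_vocab(word) + (1 - p_gen) * (attention mass on that word
    in the source diff), as in generic pointer-generator models.
    """
    if len(source_tokens) != len(attention):
        raise ValueError("one attention weight is needed per source token")

    # Scale the generation distribution by the gate.
    final = {tok: p_gen * p for tok, p in vocab_probs.items()}

    # Add copy probability; tokens outside the vocabulary become reachable.
    for tok, weight in zip(source_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


# Toy example: "loadConfig" is out of vocabulary but appears in the diff.
vocab_probs = {"fix": 0.5, "add": 0.3, "<unk>": 0.2}
diff_tokens = ["-", "parseConfig", "+", "loadConfig"]
attention = [0.1, 0.2, 0.1, 0.6]

final = mix_copy_and_generate(vocab_probs, diff_tokens, attention, p_gen=0.5)
best = max(final, key=final.get)
print(best)
```

In this toy setting the identifier from the diff can be emitted even though the fixed vocabulary would only offer `<unk>`, which is the out-of-vocabulary problem the abstract refers to.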

Original language: English (US)
Title of host publication: Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
Editors: Sarit Kraus
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 3975-3981
Number of pages: 7
ISBN (Electronic): 9780999241141
State: Published - Jan 1 2019
Externally published: Yes
Event: 28th International Joint Conference on Artificial Intelligence, IJCAI 2019 - Macao, China
Duration: Aug 10 2019 to Aug 16 2019

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2019-August
ISSN (Print): 1045-0823

Conference

Conference: 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
Country: China
City: Macao
Period: 8/10/19 to 8/16/19

Fingerprint

Copying
Semantics

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Xu, S., Yao, Y., Xu, F., Gu, T., Tong, H., & Lu, J. (2019). Commit message generation for source code changes. In S. Kraus (Ed.), Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019 (pp. 3975-3981). (IJCAI International Joint Conference on Artificial Intelligence; Vol. 2019-August). International Joint Conferences on Artificial Intelligence.

Commit message generation for source code changes. / Xu, Shengbin; Yao, Yuan; Xu, Feng; Gu, Tianxiao; Tong, Hanghang; Lu, Jian.

Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019. ed. / Sarit Kraus. International Joint Conferences on Artificial Intelligence, 2019. p. 3975-3981 (IJCAI International Joint Conference on Artificial Intelligence; Vol. 2019-August).


Xu, S, Yao, Y, Xu, F, Gu, T, Tong, H & Lu, J 2019, Commit message generation for source code changes. in S Kraus (ed.), Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019. IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, International Joint Conferences on Artificial Intelligence, pp. 3975-3981, 28th International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, 8/10/19.
Xu S, Yao Y, Xu F, Gu T, Tong H, Lu J. Commit message generation for source code changes. In Kraus S, editor, Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019. International Joint Conferences on Artificial Intelligence. 2019. p. 3975-3981. (IJCAI International Joint Conference on Artificial Intelligence).
Xu, Shengbin ; Yao, Yuan ; Xu, Feng ; Gu, Tianxiao ; Tong, Hanghang ; Lu, Jian. / Commit message generation for source code changes. Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019. editor / Sarit Kraus. International Joint Conferences on Artificial Intelligence, 2019. pp. 3975-3981 (IJCAI International Joint Conference on Artificial Intelligence).
@inproceedings{091bc9dfecac4e4eb149e99ee3adee3e,
title = "Commit message generation for source code changes",
abstract = "Commit messages, which summarize the source code changes in natural language, are essential for program comprehension and software evolution understanding. Unfortunately, due to the lack of direct motivation, commit messages are sometimes neglected by developers, making it necessary to automatically generate such messages. State-of-the-art adopts learning based approaches such as neural machine translation models for the commit message generation problem. However, they tend to ignore the code structure information and suffer from the out-of-vocabulary issue. In this paper, we propose CODISUM to address the above two limitations. In particular, we first extract both code structure and code semantics from the source code changes, and then jointly model these two sources of information so as to better learn the representations of the code changes. Moreover, we augment the model with copying mechanism to further mitigate the out-of-vocabulary issue. Experimental evaluations on real data demonstrate that the proposed approach significantly outperforms the state-of-the-art in terms of accurately generating the commit messages.",
author = "Shengbin Xu and Yuan Yao and Feng Xu and Tianxiao Gu and Hanghang Tong and Jian Lu",
year = "2019",
month = "1",
day = "1",
language = "English (US)",
series = "IJCAI International Joint Conference on Artificial Intelligence",
publisher = "International Joint Conferences on Artificial Intelligence",
pages = "3975--3981",
editor = "Sarit Kraus",
booktitle = "Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019",

}

TY - GEN

T1 - Commit message generation for source code changes

AU - Xu, Shengbin

AU - Yao, Yuan

AU - Xu, Feng

AU - Gu, Tianxiao

AU - Tong, Hanghang

AU - Lu, Jian

PY - 2019/1/1

Y1 - 2019/1/1

AB - Commit messages, which summarize the source code changes in natural language, are essential for program comprehension and software evolution understanding. Unfortunately, due to the lack of direct motivation, commit messages are sometimes neglected by developers, making it necessary to automatically generate such messages. State-of-the-art adopts learning based approaches such as neural machine translation models for the commit message generation problem. However, they tend to ignore the code structure information and suffer from the out-of-vocabulary issue. In this paper, we propose CODISUM to address the above two limitations. In particular, we first extract both code structure and code semantics from the source code changes, and then jointly model these two sources of information so as to better learn the representations of the code changes. Moreover, we augment the model with copying mechanism to further mitigate the out-of-vocabulary issue. Experimental evaluations on real data demonstrate that the proposed approach significantly outperforms the state-of-the-art in terms of accurately generating the commit messages.

UR - http://www.scopus.com/inward/record.url?scp=85074905340&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074905340&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85074905340

T3 - IJCAI International Joint Conference on Artificial Intelligence

SP - 3975

EP - 3981

BT - Proceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019

A2 - Kraus, Sarit

PB - International Joint Conferences on Artificial Intelligence

ER -