TY - GEN
T1 - Modeling semantics between programming codes and annotations
AU - Lu, Yihan
AU - Hsiao, Ihan
N1 - Publisher Copyright: © 2018 Association for Computing Machinery. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2018/7/3
Y1 - 2018/7/3
N2 - It is common practice for programmers to leave annotations during program development. Most of this annotated documentation is used predominantly as an archive of coding events for a limited set of developers. We hypothesize that these annotations capture a large amount of valuable information that can be used to identify similar code or to examine code quality. However, because annotating behaviors vary and the language composition can be complex, this work sets out to investigate a systematic method for examining annotation semantics and their relation to code. We designed a semantic parser to extract concepts from code and the corresponding annotations. Additionally, text mining techniques are applied to summarize linguistic features of the annotations. We then build models to predict concepts in programming code annotations. Results show that the proposed semantic modeling method achieves higher performance than a random-guess baseline.
AB - It is common practice for programmers to leave annotations during program development. Most of this annotated documentation is used predominantly as an archive of coding events for a limited set of developers. We hypothesize that these annotations capture a large amount of valuable information that can be used to identify similar code or to examine code quality. However, because annotating behaviors vary and the language composition can be complex, this work sets out to investigate a systematic method for examining annotation semantics and their relation to code. We designed a semantic parser to extract concepts from code and the corresponding annotations. Additionally, text mining techniques are applied to summarize linguistic features of the annotations. We then build models to predict concepts in programming code annotations. Results show that the proposed semantic modeling method achieves higher performance than a random-guess baseline.
KW - Coding concept detection
KW - Programming semantics
KW - Semantic modeling
KW - Text based classification
UR - http://www.scopus.com/inward/record.url?scp=85051495563&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051495563&partnerID=8YFLogxK
U2 - 10.1145/3209542.3209578
DO - 10.1145/3209542.3209578
M3 - Conference contribution
AN - SCOPUS:85051495563
T3 - HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media
SP - 101
EP - 105
BT - HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Hypertext and Social Media, HT 2018
Y2 - 9 July 2018 through 12 July 2018
ER -