Modeling semantics between programming codes and annotations

Yihan Lu, Ihan Hsiao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

It is common practice for programmers to leave annotations during program development. Most annotated documentation is used mainly as an archive of coding events for a limited number of developers. We hypothesize that these annotations capture a large amount of valuable information that can be used to identify similar code or to examine code quality. However, because annotating behavior varies and the language composition can be complex, this work sets out to investigate a systematic method for examining annotation semantics and their relation to code. We designed a semantic parser to extract concepts from code and the corresponding annotations. Additionally, text mining techniques are applied to summarize linguistic features of the annotations. We then build models to predict concepts in programming code annotations. Results show that the proposed semantic modeling method outperformed a random-guess baseline.
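The first two steps described in the abstract (extracting concepts from code and summarizing linguistic features of the annotations) can be illustrated with a minimal sketch. It is not the authors' parser: it assumes Python source with `#` comments as the annotations, a hypothetical `CONCEPTS` mapping from syntax-tree node types to concept labels, and plain word counts as the linguistic features. All function names are invented for illustration.

```python
import ast
import io
import re
import tokenize
from collections import Counter

# Hypothetical mapping from Python AST node types to concept labels;
# the paper's actual concept inventory is not given in the abstract.
CONCEPTS = {
    ast.For: "loop",
    ast.While: "loop",
    ast.If: "conditional",
    ast.FunctionDef: "function",
    ast.Return: "return",
}


def extract_concepts(source):
    """Return the sorted set of concept labels found in the code."""
    tree = ast.parse(source)
    found = {CONCEPTS[type(node)] for node in ast.walk(tree) if type(node) in CONCEPTS}
    return sorted(found)


def extract_annotations(source):
    """Return the text of every '#' comment, without the marker."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string.lstrip("#").strip() for tok in tokens if tok.type == tokenize.COMMENT]


def annotation_features(comments):
    """Count lowercase word tokens across all comments (bag of words)."""
    words = re.findall(r"[a-z]+", " ".join(comments).lower())
    return Counter(words)


# Small annotated snippet used as input (an assumption, not data from the paper).
SAMPLE = '''
def total(values):
    # loop over the values and add them up
    result = 0
    for v in values:
        result += v
    return result  # return the sum
'''

if __name__ == "__main__":
    print(extract_concepts(SAMPLE))
    print(extract_annotations(SAMPLE))
    print(annotation_features(extract_annotations(SAMPLE)).most_common(3))
```

Pairing the concept labels with the word counts from the same snippet gives one labeled example; a text classifier trained on many such pairs could then be compared against a random-guess baseline, as the abstract describes.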

Original language: English (US)
Title of host publication: HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media
Publisher: Association for Computing Machinery, Inc
Pages: 101-105
Number of pages: 5
ISBN (Electronic): 9781450354271
DOI: https://doi.org/10.1145/3209542.3209578
State: Published - Jul 3 2018
Event: 29th ACM International Conference on Hypertext and Social Media, HT 2018 - Baltimore, United States
Duration: Jul 9 2018 to Jul 12 2018

Fingerprint

Computer programming
Semantics
Linguistics
Chemical analysis

Keywords

  • Coding concept detection
  • Programming semantics
  • Semantic modeling
  • Text based classification

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design

Cite this

Lu, Y., & Hsiao, I. (2018). Modeling semantics between programming codes and annotations. In HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media (pp. 101-105). Association for Computing Machinery, Inc. https://doi.org/10.1145/3209542.3209578
