Identifying Trends in Technologies and Programming Languages Using Topic Modeling

Vishal Johri, Srividya Bansal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Technology question-and-answer websites are a rich source of technical knowledge. Users of these websites ask and answer a wide variety of technical questions. These questions cover a wide range of domains in computer science, such as networks, data mining, multimedia, multi-threading, web development, and mobile app development. Analyzing the textual content of these websites can help the computer science and software engineering community better understand the needs of developers and learn about current trends in technology. In this project, textual data from the popular question-and-answer website Stack Overflow is analyzed using the Latent Dirichlet Allocation (LDA) topic modeling algorithm. The results show that this technique helps discover dominant topics in developer discussions. These topics are analyzed to draw a number of observations, such as the popularity of a technology or language, the impact of a technology, technology trends over time, the relationship of a technology or language with other technologies, and comparisons of technologies that address the same area of computer science or software engineering.
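
To make the idea of LDA topic discovery concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is an illustration only and is not the authors' implementation: the abstract does not say which inference method, library, preprocessing, or hyperparameters were used. The function name `lda_gibbs`, the values of `alpha` and `beta`, the number of topics, and the toy `posts` (invented text, not real Stack Overflow data) are all assumptions made for this example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                       # n(k)
    assignments = []

    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n(d,t) + alpha) * (n(t,w) + beta) / (n(t) + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


# Hypothetical toy "posts" (invented for illustration only).
posts = [
    "android activity intent layout android fragment",
    "python pandas dataframe numpy python list",
    "android layout fragment intent activity view",
    "numpy array python dataframe pandas index",
]
docs = [p.split() for p in posts]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)

# Show the most frequent words assigned to each topic.
for k, counts in enumerate(topic_word):
    print(k, [w for w, _ in counts.most_common(4)])
```

On real data one would normally tokenize, remove stop words, and choose the number of topics empirically before interpreting the top words of each topic as a "dominant topic".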

Original language: English (US)
Title of host publication: Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 391-396
Number of pages: 6
Volume: 2018-January
ISBN (Electronic): 9781538644072
DOIs: https://doi.org/10.1109/ICSC.2018.00078
State: Published - Apr 9 2018
Event: 12th IEEE International Conference on Semantic Computing, ICSC 2018 - Laguna Hills, United States
Duration: Jan 31 2018 → Feb 2 2018

Other

Other: 12th IEEE International Conference on Semantic Computing, ICSC 2018
Country: United States
City: Laguna Hills
Period: 1/31/18 → 2/2/18

Fingerprint

Computer programming languages
Websites
Computer science
Software engineering
Programming
Modeling
Language
Application programs
Data mining
Web sites

Keywords

  • Latent Dirichlet Allocation (LDA)
  • Machine Learning
  • Natural Language Processing
  • Topic modeling

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Human-Computer Interaction
  • Information Systems and Management

Cite this

Johri, V., & Bansal, S. (2018). Identifying Trends in Technologies and Programming Languages Using Topic Modeling. In Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018 (Vol. 2018-January, pp. 391-396). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICSC.2018.00078

@inproceedings{e80205f6962e4769a80bacb4269f3720,
title = "Identifying Trends in Technologies and Programming Languages Using Topic Modeling",
abstract = "Technology question-and-answer websites are a rich source of technical knowledge. Users of these websites ask and answer a wide variety of technical questions. These questions cover a wide range of domains in computer science, such as networks, data mining, multimedia, multi-threading, web development, and mobile app development. Analyzing the textual content of these websites can help the computer science and software engineering community better understand the needs of developers and learn about current trends in technology. In this project, textual data from the popular question-and-answer website Stack Overflow is analyzed using the Latent Dirichlet Allocation (LDA) topic modeling algorithm. The results show that this technique helps discover dominant topics in developer discussions. These topics are analyzed to draw a number of observations, such as the popularity of a technology or language, the impact of a technology, technology trends over time, the relationship of a technology or language with other technologies, and comparisons of technologies that address the same area of computer science or software engineering.",
keywords = "Latent Dirichlet Allocation (LDA), Machine Learning, Natural Language Processing, Topic modeling",
author = "Vishal Johri and Srividya Bansal",
year = "2018",
month = "4",
day = "9",
doi = "10.1109/ICSC.2018.00078",
language = "English (US)",
volume = "2018-January",
pages = "391--396",
booktitle = "Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Identifying Trends in Technologies and Programming Languages Using Topic Modeling

AU - Johri, Vishal

AU - Bansal, Srividya

PY - 2018/4/9

Y1 - 2018/4/9

AB - Technology question-and-answer websites are a rich source of technical knowledge. Users of these websites ask and answer a wide variety of technical questions. These questions cover a wide range of domains in computer science, such as networks, data mining, multimedia, multi-threading, web development, and mobile app development. Analyzing the textual content of these websites can help the computer science and software engineering community better understand the needs of developers and learn about current trends in technology. In this project, textual data from the popular question-and-answer website Stack Overflow is analyzed using the Latent Dirichlet Allocation (LDA) topic modeling algorithm. The results show that this technique helps discover dominant topics in developer discussions. These topics are analyzed to draw a number of observations, such as the popularity of a technology or language, the impact of a technology, technology trends over time, the relationship of a technology or language with other technologies, and comparisons of technologies that address the same area of computer science or software engineering.

KW - Latent Dirichlet Allocation (LDA)

KW - Machine Learning

KW - Natural Language Processing

KW - Topic modeling

UR - http://www.scopus.com/inward/record.url?scp=85048384043&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048384043&partnerID=8YFLogxK

U2 - 10.1109/ICSC.2018.00078

DO - 10.1109/ICSC.2018.00078

M3 - Conference contribution

VL - 2018-January

SP - 391

EP - 396

BT - Proceedings - 12th IEEE International Conference on Semantic Computing, ICSC 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -