How important is size? An investigation of corpus size and meaning in both Latent Semantic Analysis and Latent Dirichlet Allocation

Scott A. Crossley, Mihai Dascalu, Danielle McNamara

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

This study examines how differences in corpus size influence the accuracy of Latent Semantic Analysis (LSA) spaces and Latent Dirichlet Allocation (LDA) spaces in two tasks: a word association task and a vocabulary definition test. Specific optimizations were considered in building each semantic model. Initial results indicate that larger corpora lead to greater accuracy and that LDA probabilistic models, similar to LSA vector spaces, can provide insights into cognitive processing at semantic levels.

Original language: English (US)
Title of host publication: FLAIRS 2017 - Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference
Publisher: AAAI Press
Pages: 293-296
Number of pages: 4
ISBN (Electronic): 9781577357872
State: Published - 2017
Event: 30th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017 - Marco Island, United States
Duration: May 22 2017 to May 24 2017

Other

Other: 30th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017
Country: United States
City: Marco Island
Period: 5/22/17 to 5/24/17

Fingerprint

Semantics
Vector spaces
Processing

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software

Cite this

Crossley, S. A., Dascalu, M., & McNamara, D. (2017). How important is size? An investigation of corpus size and meaning in both Latent Semantic Analysis and Latent Dirichlet Allocation. In FLAIRS 2017 - Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference (pp. 293-296). AAAI Press.

@inproceedings{c98c323902ea46558e5c6d57a1fa94f8,
title = "How important is size? An investigation of corpus size and meaning in both Latent Semantic Analysis and Latent Dirichlet Allocation",
abstract = "This study examines how differences in corpus size influence the accuracy of Latent Semantic Analysis (LSA) spaces and Latent Dirichlet Allocation (LDA) spaces in two tasks: a word association task and a vocabulary definition test. Specific optimizations were considered in building each semantic model. Initial results indicate that larger corpora lead to greater accuracy and that LDA probabilistic models, similar to LSA vector spaces, can provide insights into cognitive processing at semantic levels.",
author = "Crossley, {Scott A.} and Mihai Dascalu and Danielle McNamara",
year = "2017",
language = "English (US)",
pages = "293--296",
booktitle = "FLAIRS 2017 - Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference",
publisher = "AAAI Press",

}

TY - GEN

T1 - How important is size? An investigation of corpus size and meaning in both Latent Semantic Analysis and Latent Dirichlet Allocation

AU - Crossley, Scott A.

AU - Dascalu, Mihai

AU - McNamara, Danielle

PY - 2017

Y1 - 2017

N2 - This study examines how differences in corpus size influence the accuracy of Latent Semantic Analysis (LSA) spaces and Latent Dirichlet Allocation (LDA) spaces in two tasks: a word association task and a vocabulary definition test. Specific optimizations were considered in building each semantic model. Initial results indicate that larger corpora lead to greater accuracy and that LDA probabilistic models, similar to LSA vector spaces, can provide insights into cognitive processing at semantic levels.

UR - http://www.scopus.com/inward/record.url?scp=85029473248&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85029473248&partnerID=8YFLogxK

M3 - Conference contribution

SP - 293

EP - 296

BT - FLAIRS 2017 - Proceedings of the 30th International Florida Artificial Intelligence Research Society Conference

PB - AAAI Press

ER -