Abstract

A story is defined as "an actor(s) taking action(s) that culminates in a resolution(s)." In this paper, we investigate the utility of standard keyword based features, statistical features based on shallow-parsing (such as density of POS tags and named entities), and a new set of semantic features to develop a story classifier. This classifier is trained to identify a paragraph as a "story," if the paragraph contains mostly story(ies). Training data is a collection of expert-coded story and non-story paragraphs from RSS feeds from a list of extremist web sites. Our proposed semantic features are based on suitable aggregation and generalization of <Subject, Verb, Object> triplets that can be extracted using a parser. Experimental results show that a model of statistical features alongside memory-based semantic linguistic features achieves the best accuracy with a Support Vector Machine (SVM) classifier.
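
To make the aggregation and generalization of <Subject, Verb, Object> triplets more concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the triplets have already been extracted by a parser. The `GENERALIZE` table, its category names, and the function `triplet_features` are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical generalization table: maps surface words to coarser classes.
# A real system might derive such classes from a lexical resource instead.
GENERALIZE = {
    "police": "AUTHORITY",
    "officers": "AUTHORITY",
    "arrested": "DETAIN",
    "detained": "DETAIN",
    "suspect": "PERSON",
    "man": "PERSON",
}


def generalize(word):
    """Return the coarse class for a word, or the lowercased word itself."""
    key = word.lower()
    return GENERALIZE.get(key, key)


def triplet_features(triplets):
    """Aggregate (subject, verb, object) triplets into sparse count features."""
    features = Counter()
    for subj, verb, obj in triplets:
        s, v, o = generalize(subj), generalize(verb), generalize(obj)
        # Count each generalized slot on its own and the full triplet pattern.
        features[f"S={s}"] += 1
        features[f"V={v}"] += 1
        features[f"O={o}"] += 1
        features[f"SVO={s}|{v}|{o}"] += 1
    return dict(features)


# Triplets assumed to come from one paragraph (illustrative data only).
paragraph_triplets = [
    ("Police", "arrested", "suspect"),
    ("Officers", "detained", "man"),
]
features = triplet_features(paragraph_triplets)
```

Both example triplets collapse into the same generalized pattern, which is the point of generalizing before counting: different surface wordings of a similar event contribute to one feature. Per-paragraph counts of this kind could then be passed, together with the statistical features, to a classifier such as an SVM.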

Original language: English (US)
Title of host publication: Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012
Pages: 573-580
Number of pages: 8
DOIs: https://doi.org/10.1109/ASONAM.2012.97
State: Published - 2012
Event: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012 - Istanbul, Turkey
Duration: Aug 26 2012 to Aug 29 2012

Other

Other: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012
Country: Turkey
City: Istanbul
Period: 8/26/12 to 8/29/12

Fingerprint

Classifiers
Semantics
RSS
Linguistics
Support vector machines
Websites
Agglomeration
Data storage equipment

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software

Cite this

Ceran, B., Karad, R., Mandvekar, A., Corman, S., & Davulcu, H. (2012). A semantic triplet based story classifier. In Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012 (pp. 573-580). [6425707] https://doi.org/10.1109/ASONAM.2012.97

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{eb2ddd98daa848e2a819c1d5d84b9567,
title = "A semantic triplet based story classifier",
abstract = "A story is defined as {"}an actor(s) taking action(s) that culminates in a resolution(s).{"} In this paper, we investigate the utility of standard keyword based features, statistical features based on shallow-parsing (such as density of POS tags and named entities), and a new set of semantic features to develop a story classifier. This classifier is trained to identify a paragraph as a {"}story,{"} if the paragraph contains mostly story(ies). Training data is a collection of expert-coded story and non-story paragraphs from RSS feeds from a list of extremist web sites. Our proposed semantic features are based on suitable aggregation and generalization of <Subject, Verb, Object> triplets that can be extracted using a parser. Experimental results show that a model of statistical features alongside memory-based semantic linguistic features achieves the best accuracy with a Support Vector Machine (SVM) classifier.",
author = "Betul Ceran and Ravi Karad and Ajay Mandvekar and Steven Corman and Hasan Davulcu",
year = "2012",
doi = "10.1109/ASONAM.2012.97",
language = "English (US)",
isbn = "9780769547992",
pages = "573--580",
booktitle = "Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012",

}

TY - GEN

T1 - A semantic triplet based story classifier

AU - Ceran, Betul

AU - Karad, Ravi

AU - Mandvekar, Ajay

AU - Corman, Steven

AU - Davulcu, Hasan

PY - 2012

Y1 - 2012

AB - A story is defined as "an actor(s) taking action(s) that culminates in a resolution(s)." In this paper, we investigate the utility of standard keyword based features, statistical features based on shallow-parsing (such as density of POS tags and named entities), and a new set of semantic features to develop a story classifier. This classifier is trained to identify a paragraph as a "story," if the paragraph contains mostly story(ies). Training data is a collection of expert-coded story and non-story paragraphs from RSS feeds from a list of extremist web sites. Our proposed semantic features are based on suitable aggregation and generalization of <Subject, Verb, Object> triplets that can be extracted using a parser. Experimental results show that a model of statistical features alongside memory-based semantic linguistic features achieves the best accuracy with a Support Vector Machine (SVM) classifier.

UR - http://www.scopus.com/inward/record.url?scp=84874225265&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874225265&partnerID=8YFLogxK

U2 - 10.1109/ASONAM.2012.97

DO - 10.1109/ASONAM.2012.97

M3 - Conference contribution

SN - 9780769547992

SP - 573

EP - 580

BT - Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012

ER -