Abstract

A story is defined as "an actor(s) taking action(s) that culminates in a resolution(s)." In this paper, we investigate the utility of standard keyword-based features, statistical features based on shallow parsing (such as the density of POS tags and named entities), and a new set of semantic features for developing a story classifier. The classifier is trained to label a paragraph as a "story" if the paragraph consists mostly of story content. The training data is a collection of expert-coded story and non-story paragraphs drawn from the RSS feeds of a list of extremist web sites. Our proposed semantic features are based on suitable aggregation and generalization of <Subject, Verb, Object> triplets that can be extracted using a parser. Experimental results show that a model combining statistical features with memory-based semantic linguistic features achieves the best accuracy with a Support Vector Machine (SVM) classifier.
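The idea of aggregating and generalizing <Subject, Verb, Object> triplets into paragraph-level features can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say how triplets are generalized, so the `GENERALIZE` lookup table, its category labels, and the function names `generalize` and `triplet_features` are hypothetical. The triplets are supplied by hand rather than produced by a parser.

```python
from collections import Counter

# Hypothetical generalization table (an assumption for illustration only;
# the abstract does not specify how triplet elements are generalized).
GENERALIZE = {
    "soldiers": "ACTOR_GROUP",
    "fighters": "ACTOR_GROUP",
    "attacked": "ACT_VIOLENCE",
    "bombed": "ACT_VIOLENCE",
    "convoy": "TARGET",
    "checkpoint": "TARGET",
}


def generalize(triplet):
    """Map each element of a (subject, verb, object) triplet to a coarse label."""
    return tuple(GENERALIZE.get(word.lower(), "OTHER") for word in triplet)


def triplet_features(triplets):
    """Aggregate generalized triplets into relative-frequency features."""
    counts = Counter(generalize(t) for t in triplets)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


# Hand-written triplets standing in for parser output on one paragraph.
paragraph_triplets = [
    ("fighters", "attacked", "convoy"),
    ("soldiers", "bombed", "checkpoint"),
    ("he", "said", "nothing"),
]
features = triplet_features(paragraph_triplets)
```

In a setup like the one the abstract describes, a feature dictionary of this kind would be computed per paragraph, combined with the statistical features, and passed to a classifier such as an SVM.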

Original language: English (US)
Title of host publication: Proceedings of the 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012
Pages: 573-580
Number of pages: 8
State: Published - Dec 1 2012
Event: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012 - Istanbul, Turkey
Duration: Aug 26 2012 to Aug 29 2012

Other

Other: 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2012
Country/Territory: Turkey
City: Istanbul
Period: 8/26/12 to 8/29/12

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software
