Textual signatures: Identifying text-types using latent semantic analysis to measure the cohesion of text structures

Philip M. McCarthy, Stephen W. Briner, Vasile Rus, Danielle McNamara

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

21 Citations (Scopus)

Abstract

Just as a sentence is far more than a mere concatenation of words, a text is far more than a mere concatenation of sentences. Texts contain pertinent information that co-refers across sentences and paragraphs [30]; texts contain relations between phrases, clauses, and sentences that are often causally linked [21], [51], [56]; and texts that depend on relating a series of chronological events contain temporal features that help the reader to build a coherent representation of the text [19], [55]. We refer to textual features such as these as cohesive elements, and they occur within paragraphs (locally), across paragraphs (globally), and in forms such as referential, causal, temporal, and structural [18], [22], [36]. But cohesive elements, and by consequence cohesion, do not simply feature in a text as dialogues tend to feature in narratives, or as cartoons tend to feature in newspapers. That is, cohesion is not present or absent in a binary or optional sense. Instead, cohesion in text exists on a continuum of presence, which is sometimes indicative of the text-type in question [12], [37], [41] and sometimes indicative of the audience for which the text was written [44], [47]. In this chapter, we discuss the nature and importance of cohesion; we demonstrate a computational tool that measures cohesion; and, most importantly, we demonstrate a novel approach to identifying text-types by incorporating contrasting rates of cohesion.

Original language: English (US)
Title of host publication: Natural Language Processing and Text Mining
Publisher: Springer London
Pages: 107-122
Number of pages: 16
ISBN (Print): 184628175X, 9781846281754
DOI: https://doi.org/10.1007/978-1-84628-754-1_7
State: Published - 2007
Externally published: Yes

Fingerprint

Semantics

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

McCarthy, P. M., Briner, S. W., Rus, V., & McNamara, D. (2007). Textual signatures: Identifying text-types using latent semantic analysis to measure the cohesion of text structures. In Natural Language Processing and Text Mining (pp. 107-122). Springer London. https://doi.org/10.1007/978-1-84628-754-1_7

@inbook{72a75559f3404950bdb8c58107087051,
title = "Textual signatures: Identifying text-types using latent semantic analysis to measure the cohesion of text structures",
author = "McCarthy, {Philip M.} and Briner, {Stephen W.} and Vasile Rus and Danielle McNamara",
year = "2007",
doi = "10.1007/978-1-84628-754-1_7",
language = "English (US)",
isbn = "184628175X",
pages = "107--122",
booktitle = "Natural Language Processing and Text Mining",
publisher = "Springer London",

}

TY - CHAP
T1 - Textual signatures
T2 - Identifying text-types using latent semantic analysis to measure the cohesion of text structures
AU - McCarthy, Philip M.
AU - Briner, Stephen W.
AU - Rus, Vasile
AU - McNamara, Danielle
PY - 2007
Y1 - 2007
UR - http://www.scopus.com/inward/record.url?scp=84890215811&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890215811&partnerID=8YFLogxK
U2 - 10.1007/978-1-84628-754-1_7
DO - 10.1007/978-1-84628-754-1_7
M3 - Chapter
AN - SCOPUS:84890215811
SN - 184628175X
SN - 9781846281754
SP - 107
EP - 122
BT - Natural Language Processing and Text Mining
PB - Springer London
ER -