I say, you say, we say: Using spoken language to model socio-cognitive processes during computer-supported collaborative problem solving

Angela E.B. Stewart, Hana Vrzakova, Chen Sun, Jade Yonehiro, Cathlyn Adele Stone, Nicholas D. Duran, Valerie Shute, Sidney K. D’Mello

Research output: Contribution to journal › Article

Abstract

Collaborative problem solving (CPS) is a crucial 21st-century skill; however, current technologies fall short of effectively supporting CPS processes, especially for remote, computer-enabled interactions. In order to develop next-generation computer-supported collaborative systems that enhance CPS processes and outcomes by monitoring and responding to the unfolding collaboration, we investigate automated detection of three critical CPS processes – construction of shared knowledge, negotiation/coordination, and maintaining team function – derived from a validated CPS framework. Our data consist of 32 triads who were tasked with collaboratively solving a challenging visual computer programming task for 20 minutes using commercial videoconferencing software. We used automatic speech recognition to generate transcripts of 11,163 utterances, which trained human raters coded for evidence of the above three CPS processes using a set of behavioral indicators. We aimed to automate the trained raters' codes in a team-independent fashion (current study) in order to provide automatic real-time or offline feedback (future work). We used Random Forest classifiers trained on the words themselves (bag of n-grams) or on word categories (e.g., emotions, thinking styles, social constructs) from the Linguistic Inquiry and Word Count (LIWC) tool. Despite imperfect automatic speech recognition, the n-gram models achieved AUROC (area under the receiver operating characteristic curve) scores of .85, .77, and .77 for construction of shared knowledge, negotiation/coordination, and maintaining team function, respectively; these reflect 70%, 54%, and 54% improvements over chance. The LIWC-category models achieved similar scores of .82, .74, and .73 (64%, 48%, and 46% improvements over chance). Further, the LIWC model-derived scores predicted CPS outcomes similarly to the human codes, demonstrating predictive validity. We discuss embedding our models in collaborative interfaces for assessment and dynamic intervention aimed at improving CPS outcomes.
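The modeling setup the abstract describes — Random Forest classifiers over bag-of-n-gram features, evaluated team-independently with AUROC — can be sketched roughly as below. This is a minimal illustration using scikit-learn, not the authors' code: the utterances, labels, team assignments, and hyperparameters are all hypothetical, and the reported figures (e.g., .85 AUROC = 70% over chance) are simply the relative gain over the 0.5 chance-level AUROC.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GroupKFold

def improvement_over_chance(auroc, chance=0.5):
    """Relative AUROC gain over the 0.5 chance level, e.g. .85 -> 70%."""
    return (auroc - chance) / chance

# Toy utterances standing in for ASR transcripts (hypothetical data);
# labels mark one CPS process, e.g. construction of shared knowledge.
utterances = [
    "i think we should put the flag block here",
    "hmm this level is confusing",
    "so you want the loop to repeat three times",
    "ugh it crashed again",
    "okay so we agree to test the jump first",
    "nice work everyone",
    "wait do you mean the second condition",
    "that was pretty funny",
]
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])
teams = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # triad id per utterance

# Bag of unigrams and bigrams over the transcripts.
X = CountVectorizer(ngram_range=(1, 2)).fit_transform(utterances)

# Team-independent evaluation: GroupKFold guarantees no team's
# utterances appear in both the train and test folds.
aucs = []
for train, test in GroupKFold(n_splits=2).split(X, labels, groups=teams):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train], labels[train])
    scores = clf.predict_proba(X[test])[:, 1]
    aucs.append(roc_auc_score(labels[test], scores))

print(f"mean AUROC: {np.mean(aucs):.2f}")
```

The same loop would swap `CountVectorizer` features for per-utterance LIWC category counts to reproduce the second model family; the group-wise split is what makes the reported scores team-independent.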

Original language: English (US)
Article number: 194
Journal: Proceedings of the ACM on Human-Computer Interaction
Volume: 3
Issue number: CSCW
DOI: 10.1145/3359296
State: Published - Nov 2019
Externally published: Yes


Keywords

  • Collaborative interfaces
  • Collaborative problem solving
  • Language analysis

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Human-Computer Interaction
  • Computer Networks and Communications

Cite this

@article{734f07677ed044cb81b0e6bced1d7f9e,
title = "I say, you say, we say: Using spoken language to model socio-cognitive processes during computer-supported collaborative problem solving",
abstract = "Collaborative problem solving (CPS) is a crucial 21st-century skill; however, current technologies fall short of effectively supporting CPS processes, especially for remote, computer-enabled interactions. In order to develop next-generation computer-supported collaborative systems that enhance CPS processes and outcomes by monitoring and responding to the unfolding collaboration, we investigate automated detection of three critical CPS processes – construction of shared knowledge, negotiation/coordination, and maintaining team function – derived from a validated CPS framework. Our data consist of 32 triads who were tasked with collaboratively solving a challenging visual computer programming task for 20 minutes using commercial videoconferencing software. We used automatic speech recognition to generate transcripts of 11,163 utterances, which trained human raters coded for evidence of the above three CPS processes using a set of behavioral indicators. We aimed to automate the trained raters' codes in a team-independent fashion (current study) in order to provide automatic real-time or offline feedback (future work). We used Random Forest classifiers trained on the words themselves (bag of n-grams) or on word categories (e.g., emotions, thinking styles, social constructs) from the Linguistic Inquiry and Word Count (LIWC) tool. Despite imperfect automatic speech recognition, the n-gram models achieved AUROC (area under the receiver operating characteristic curve) scores of .85, .77, and .77 for construction of shared knowledge, negotiation/coordination, and maintaining team function, respectively; these reflect 70\%, 54\%, and 54\% improvements over chance. The LIWC-category models achieved similar scores of .82, .74, and .73 (64\%, 48\%, and 46\% improvements over chance). Further, the LIWC model-derived scores predicted CPS outcomes similarly to the human codes, demonstrating predictive validity. We discuss embedding our models in collaborative interfaces for assessment and dynamic intervention aimed at improving CPS outcomes.",
keywords = "Collaborative interfaces, Collaborative problem solving, Language analysis",
author = "Stewart, {Angela E.B.} and Hana Vrzakova and Chen Sun and Jade Yonehiro and Stone, {Cathlyn Adele} and Duran, {Nicholas D.} and Valerie Shute and D’Mello, {Sidney K.}",
year = "2019",
month = "11",
doi = "10.1145/3359296",
language = "English (US)",
volume = "3",
journal = "Proceedings of the ACM on Human-Computer Interaction",
issn = "2573-0142",
publisher = "Association for Computing Machinery (ACM)",
number = "CSCW",
}
