Objective assessment of social skills using automated language analysis for identification of schizophrenia and bipolar disorder

Rohit Voleti, Stephanie Woolridge, Julie M. Liss, Melissa Milanovic, Christopher R. Bowie, Visar Berisha

Research output: Contribution to journal › Conference article

Abstract

Several studies have shown that speech and language features, automatically extracted from clinical interviews or spontaneous discourse, have diagnostic value for mental disorders such as schizophrenia and bipolar disorder. They typically make use of a large feature set to train a classifier for distinguishing between two groups of interest, i.e. a clinical and control group. However, a purely data-driven approach runs the risk of overfitting to a particular data set, especially when sample sizes are limited. Here, we first down-select the set of language features to a small subset that is related to a well-validated test of functional ability, the Social Skills Performance Assessment (SSPA). This helps establish the concurrent validity of the selected features. We use only these features to train a simple classifier to distinguish between groups of interest. Linear regression reveals that a subset of language features can effectively model the SSPA, with a correlation coefficient of 0.75. Furthermore, the same feature set can be used to build a strong binary classifier to distinguish between healthy controls and a clinical group (AUC = 0.96) and also between patients within the clinical group with schizophrenia and bipolar I disorder (AUC = 0.83).
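The down-select-then-classify idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic, the names (`select_features`, `reference`, `columns`) are hypothetical, and a signed sum of the selected features stands in for the paper's linear regression and classifier, whose details the abstract does not give. The score is also evaluated on the same samples used for selection, so the resulting AUC is optimistic; a real study would need held-out data or cross-validation.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def select_features(columns, reference, k):
    """Indices of the k feature columns with the largest |correlation| to the reference score."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: abs(pearson(columns[j], reference)),
                    reverse=True)
    return sorted(ranked[:k])


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data (not from the paper): 40 samples, 5 candidate features.
rng = random.Random(0)
n = 40
labels = [i % 2 for i in range(n)]                       # hypothetical group labels
reference = [10 * y + rng.gauss(0, 2) for y in labels]   # stand-in for a clinical score
columns = [
    [r + rng.gauss(0, 1) for r in reference],    # tracks the reference score
    [-r + rng.gauss(0, 1) for r in reference],   # tracks it with opposite sign
    [rng.gauss(0, 1) for _ in range(n)],         # pure noise
    [rng.gauss(0, 1) for _ in range(n)],         # pure noise
    [rng.gauss(0, 1) for _ in range(n)],         # pure noise
]

# Step 1: keep only the features most related to the reference score.
selected = select_features(columns, reference, k=2)

# Step 2: build a simple score from the selected features only, flipping
# each feature so that it points in the same direction as the reference.
signs = {j: 1 if pearson(columns[j], reference) >= 0 else -1 for j in selected}
scores = [sum(signs[j] * columns[j][i] for j in selected) for i in range(n)]

# Step 3: measure how well that score separates the two groups (in-sample).
auc_value = auc(scores, labels)
print(selected, round(auc_value, 3))
```

Restricting the score to features that track an external, validated measure is what limits the search space here; the noise columns never reach the scoring step, which is the overfitting safeguard the abstract argues for.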

Original language: English (US)
Pages (from-to): 1433-1437
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2019-September
DOIs: 10.21437/Interspeech.2019-2960
State: Published - Jan 1 2019
Externally published: Yes
Event: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria
Duration: Sep 15 2019 to Sep 19 2019

Keywords

  • Bipolar disorder
  • Computational linguistics
  • Natural language processing
  • Schizophrenia
  • Semantic coherence

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation

Cite this

Objective assessment of social skills using automated language analysis for identification of schizophrenia and bipolar disorder. / Voleti, Rohit; Woolridge, Stephanie; Liss, Julie M.; Milanovic, Melissa; Bowie, Christopher R.; Berisha, Visar.

In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2019-September, 01.01.2019, p. 1433-1437.

@article{0a5e8b0ccca24a6a91d94a3392dc3fa5,
title = "Objective assessment of social skills using automated language analysis for identification of schizophrenia and bipolar disorder",
abstract = "Several studies have shown that speech and language features, automatically extracted from clinical interviews or spontaneous discourse, have diagnostic value for mental disorders such as schizophrenia and bipolar disorder. They typically make use of a large feature set to train a classifier for distinguishing between two groups of interest, i.e. a clinical and control group. However, a purely data-driven approach runs the risk of overfitting to a particular data set, especially when sample sizes are limited. Here, we first down-select the set of language features to a small subset that is related to a well-validated test of functional ability, the Social Skills Performance Assessment (SSPA). This helps establish the concurrent validity of the selected features. We use only these features to train a simple classifier to distinguish between groups of interest. Linear regression reveals that a subset of language features can effectively model the SSPA, with a correlation coefficient of 0.75. Furthermore, the same feature set can be used to build a strong binary classifier to distinguish between healthy controls and a clinical group (AUC = 0.96) and also between patients within the clinical group with schizophrenia and bipolar I disorder (AUC = 0.83).",
keywords = "Bipolar disorder, Computational linguistics, Natural language processing, Schizophrenia, Semantic coherence",
author = "Rohit Voleti and Stephanie Woolridge and Liss, {Julie M.} and Melissa Milanovic and Bowie, {Christopher R.} and Visar Berisha",
year = "2019",
month = "1",
day = "1",
doi = "10.21437/Interspeech.2019-2960",
language = "English (US)",
volume = "2019-September",
pages = "1433--1437",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",
}

TY - JOUR

T1 - Objective assessment of social skills using automated language analysis for identification of schizophrenia and bipolar disorder

AU - Voleti, Rohit

AU - Woolridge, Stephanie

AU - Liss, Julie M.

AU - Milanovic, Melissa

AU - Bowie, Christopher R.

AU - Berisha, Visar

PY - 2019/1/1

Y1 - 2019/1/1

AB - Several studies have shown that speech and language features, automatically extracted from clinical interviews or spontaneous discourse, have diagnostic value for mental disorders such as schizophrenia and bipolar disorder. They typically make use of a large feature set to train a classifier for distinguishing between two groups of interest, i.e. a clinical and control group. However, a purely data-driven approach runs the risk of overfitting to a particular data set, especially when sample sizes are limited. Here, we first down-select the set of language features to a small subset that is related to a well-validated test of functional ability, the Social Skills Performance Assessment (SSPA). This helps establish the concurrent validity of the selected features. We use only these features to train a simple classifier to distinguish between groups of interest. Linear regression reveals that a subset of language features can effectively model the SSPA, with a correlation coefficient of 0.75. Furthermore, the same feature set can be used to build a strong binary classifier to distinguish between healthy controls and a clinical group (AUC = 0.96) and also between patients within the clinical group with schizophrenia and bipolar I disorder (AUC = 0.83).

KW - Bipolar disorder

KW - Computational linguistics

KW - Natural language processing

KW - Schizophrenia

KW - Semantic coherence

UR - http://www.scopus.com/inward/record.url?scp=85074706830&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074706830&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2019-2960

DO - 10.21437/Interspeech.2019-2960

M3 - Conference article

AN - SCOPUS:85074706830

VL - 2019-September

SP - 1433

EP - 1437

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SN - 2308-457X

ER -