Attend and diagnose

Clinical time series analysis using attention models

Huan Song, Deepta Rajan, Jayaraman J. Thiagarajan, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

With the widespread adoption of electronic health records, there is an increased emphasis on predictive models that can effectively deal with clinical time-series data. Powered by Recurrent Neural Network (RNN) architectures with Long Short-Term Memory (LSTM) units, deep neural networks have achieved state-of-the-art results in several clinical prediction tasks. Despite the success of RNNs, their sequential nature prohibits parallelized computing, making them inefficient, particularly when processing long sequences. Recently, architectures based solely on attention mechanisms have shown remarkable success in transduction tasks in NLP, while being computationally superior. In this paper, for the first time, we utilize attention models for clinical time-series modeling, thereby dispensing with recurrence entirely. We develop the SAnD (Simply Attend and Diagnose) architecture, which employs a masked self-attention mechanism and uses positional encoding and dense interpolation strategies to incorporate temporal order. Furthermore, we develop a multi-task variant of SAnD to jointly infer models across multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we demonstrate that the proposed approach achieves state-of-the-art performance in all tasks, outperforming LSTM models and classical baselines with hand-engineered features.
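The abstract mentions a dense interpolation strategy for summarizing a variable-length sequence of self-attention outputs into a fixed-size representation while preserving temporal order, but gives no details. The sketch below is one plausible reading of such a scheme, not the paper's exact algorithm: it assumes a `(T, d)` matrix of per-timestep encoder outputs and an interpolation factor `M`, and weights each timestep's contribution to each of the `M` output slots by its proximity to that slot.

```python
import numpy as np

def dense_interpolation(hidden: np.ndarray, factor: int) -> np.ndarray:
    """Collapse a (T, d) sequence of per-step encodings into a fixed
    (factor, d) summary. Each timestep contributes to every output slot
    with a weight that decays quadratically with its distance from the
    slot, so earlier steps dominate earlier slots and later steps
    dominate later slots -- temporal order survives without recurrence."""
    T, d = hidden.shape
    out = np.zeros((factor, d))
    for t in range(T):
        # relative position of timestep t on the [0, factor] axis
        s = factor * (t + 1) / T
        for m in range(1, factor + 1):
            w = (1.0 - abs(s - m) / factor) ** 2
            out[m - 1] += w * hidden[t]
    return out
```

Regardless of the input length `T`, the result is a fixed `(factor, d)` matrix that a downstream classifier can consume directly, which is the role the abstract assigns to this component.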

Original language: English (US)
Title of host publication: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Publisher: AAAI Press
Pages: 4091-4098
Number of pages: 8
ISBN (Electronic): 9781577358008
State: Published - Jan 1, 2018
Event: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 - New Orleans, United States
Duration: Feb 2, 2018 - Feb 7, 2018


Fingerprint

  • Time series analysis
  • Time series
  • Recurrent neural networks
  • Network architecture
  • Interpolation
  • Health
  • Processing
  • Long short-term memory

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Song, H., Rajan, D., Thiagarajan, J. J., & Spanias, A. (2018). Attend and diagnose: Clinical time series analysis using attention models. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 4091-4098). AAAI Press.
