CSaRUS-CNN at AMIA-2017 Tasks 1, 2: Under sampled CNN for text classification

Arjun Magge, Matthew Scotch, Graciela Gonzalez

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Most practical text classification tasks in natural language processing involve training sets in which the number of training instances is not equal across classes. In such cases, the performance of the classifier can be affected by the sampling strategies used in training. In this work, we describe cost-sensitive and random undersampling variants of convolutional neural networks (CNNs) for classifying texts in imbalanced datasets and analyze their results. The classifier proposed in this paper achieves a maximum F1-score of 0.414, placing 2nd on the ADR dataset, and a maximum F1-score of 0.652, placing 6th on the medication intake dataset.
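The two strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `random_undersample` and `inverse_frequency_weights`, the choice to shrink every class to the size of the smallest one, and the inverse-frequency weighting formula are all assumptions made for illustration, and the CNN itself is omitted.

```python
import random
from collections import Counter, defaultdict


def random_undersample(texts, labels, seed=0):
    """Randomly drop examples so every class matches the smallest class.

    Hypothetical helper: the paper's exact sampling ratio is not given
    in the abstract, so a fully balanced target is assumed here.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Size of the rarest class sets the per-class quota.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for label, items in by_class.items():
        for text in rng.sample(items, target):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return balanced


def inverse_frequency_weights(labels):
    """Per-class loss weights for cost-sensitive training.

    Assumed formula: total / (num_classes * class_count), so rarer
    classes receive proportionally larger weights.
    """
    counts = Counter(labels)
    total = len(labels)
    return {label: total / (len(counts) * n) for label, n in counts.items()}


if __name__ == "__main__":
    texts = [f"tweet {i}" for i in range(10)]
    labels = ["neg"] * 8 + ["pos"] * 2
    print(Counter(label for _, label in random_undersample(texts, labels)))
    print(inverse_frequency_weights(labels))
```

Undersampling changes the data the classifier sees, while the weights leave the data intact and instead scale each example's contribution to the training loss; either output would then be passed to whatever classifier is being trained.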

Original language: English (US)
Pages (from-to): 76-78
Number of pages: 3
Journal: CEUR Workshop Proceedings
Volume: 1996
State: Published - Jan 1 2017

Fingerprint

Classifiers
Neural networks
Sampling
Processing
Costs

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

CSaRUS-CNN at AMIA-2017 Tasks 1, 2: Under sampled CNN for text classification. / Magge, Arjun; Scotch, Matthew; Gonzalez, Graciela.

In: CEUR Workshop Proceedings, Vol. 1996, 01.01.2017, p. 76-78.

@article{ccecaac044f440f1a09ea389e85b2d27,
title = "CSaRUS-CNN at AMIA-2017 Tasks 1, 2: Under sampled CNN for text classification",
abstract = "Most practical text classification tasks in natural language processing involve training sets in which the number of training instances is not equal across classes. In such cases, the performance of the classifier can be affected by the sampling strategies used in training. In this work, we describe cost-sensitive and random undersampling variants of convolutional neural networks (CNNs) for classifying texts in imbalanced datasets and analyze their results. The classifier proposed in this paper achieves a maximum F1-score of 0.414, placing 2nd on the ADR dataset, and a maximum F1-score of 0.652, placing 6th on the medication intake dataset.",
author = "Arjun Magge and Matthew Scotch and Graciela Gonzalez",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
volume = "1996",
pages = "76--78",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "CEUR-WS",

}

TY - JOUR

T1 - CSaRUS-CNN at AMIA-2017 Tasks 1, 2

T2 - Under sampled CNN for text classification

AU - Magge, Arjun

AU - Scotch, Matthew

AU - Gonzalez, Graciela

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Most practical text classification tasks in natural language processing involve training sets in which the number of training instances is not equal across classes. In such cases, the performance of the classifier can be affected by the sampling strategies used in training. In this work, we describe cost-sensitive and random undersampling variants of convolutional neural networks (CNNs) for classifying texts in imbalanced datasets and analyze their results. The classifier proposed in this paper achieves a maximum F1-score of 0.414, placing 2nd on the ADR dataset, and a maximum F1-score of 0.652, placing 6th on the medication intake dataset.

AB - Most practical text classification tasks in natural language processing involve training sets in which the number of training instances is not equal across classes. In such cases, the performance of the classifier can be affected by the sampling strategies used in training. In this work, we describe cost-sensitive and random undersampling variants of convolutional neural networks (CNNs) for classifying texts in imbalanced datasets and analyze their results. The classifier proposed in this paper achieves a maximum F1-score of 0.414, placing 2nd on the ADR dataset, and a maximum F1-score of 0.652, placing 6th on the medication intake dataset.

UR - http://www.scopus.com/inward/record.url?scp=85037044221&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85037044221&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:85037044221

VL - 1996

SP - 76

EP - 78

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

SN - 1613-0073

ER -