Balancing Performance Measures in Classification Using Ensemble Learning Methods

Neeraj Bahl, Ajay Bansal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Ensemble learning methods have recently been widely used in various domains and applications, owing to improvements in computational efficiency and advances in distributed computing. However, with the growing variety of applications of machine learning techniques to class imbalance problems, further focus is needed to evaluate, improve, and balance other performance measures in classification, such as sensitivity (true positive rate) and specificity (true negative rate). This paper demonstrates an approach to evaluating and balancing these two measures using ensemble learning methods for classification, which can be especially useful on class-imbalanced datasets. Specifically, bagging and boosting are used to balance sensitivity and specificity on a diabetes dataset when predicting whether a patient will be readmitted to the hospital based on various feature vectors. From the experiments conducted, it can be empirically concluded that, with ensemble learning methods, accuracy improves by some margin, while sensitivity and specificity are balanced significantly and consistently across different cross-validation approaches.
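
The two measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the function name and the label vectors below are hypothetical, chosen only to show why accuracy alone can mislead on a class-imbalanced dataset and what "balanced" sensitivity and specificity look like.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity = true positive rate, specificity = true negative rate.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical imbalanced labels: 10 positives followed by 90 negatives.
y_true = [1] * 10 + [0] * 90

# A degenerate classifier that always predicts the majority (negative) class.
always_negative = [0] * 100

# A hypothetical classifier that finds 7 of the 10 positives
# and mislabels 18 of the 90 negatives as positive.
more_balanced = [1] * 7 + [0] * 3 + [1] * 18 + [0] * 72

for name, y_pred in [("always negative", always_negative),
                     ("more balanced", more_balanced)]:
    sens, spec, acc = sensitivity_specificity(y_true, y_pred)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

In this made-up example the majority-class predictor has the higher accuracy even though its sensitivity is zero, while the second predictor trades some accuracy for sensitivity and specificity that are much closer to each other.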

Original language: English (US)
Title of host publication: Business Information Systems - 22nd International Conference, BIS 2019, Proceedings
Editors: Witold Abramowicz, Rafael Corchuelo
Publisher: Springer Verlag
Pages: 311-324
Number of pages: 14
ISBN (Print): 9783030204815
DOI: https://doi.org/10.1007/978-3-030-20482-2_25
State: Published - Jan 1 2019
Event: 22nd International Conference on Business Information Systems, BIS 2019 - Seville, Spain
Duration: Jun 26 2019 to Jun 28 2019

Publication series

Name: Lecture Notes in Business Information Processing
Volume: 354
ISSN (Print): 1865-1348

Conference

Conference: 22nd International Conference on Business Information Systems, BIS 2019
Country: Spain
City: Seville
Period: 6/26/19 to 6/28/19

Fingerprint

Ensemble Learning
Balancing
Performance Measures
Specificity
Distributed computer systems
Medical problems
Computational efficiency
Learning systems
Bagging
Evaluate
Diabetes
Boosting
Distributed Computing
Feature Vector
Cross-validation
Margin
Machine Learning
Experiments
Predict

Keywords

  • Balancing
  • Boosting
  • Classification
  • Ensemble methods

ASJC Scopus subject areas

  • Management Information Systems
  • Control and Systems Engineering
  • Business and International Management
  • Information Systems
  • Modeling and Simulation
  • Information Systems and Management

Cite this

Bahl, N., & Bansal, A. (2019). Balancing Performance Measures in Classification Using Ensemble Learning Methods. In W. Abramowicz, & R. Corchuelo (Eds.), Business Information Systems - 22nd International Conference, BIS 2019, Proceedings (pp. 311-324). (Lecture Notes in Business Information Processing; Vol. 354). Springer Verlag. https://doi.org/10.1007/978-3-030-20482-2_25
