Diagnosis of coronary artery disease using cost-sensitive algorithms

Roohallah Alizadehsani; Mohammad Javad Hosseini; Zahra Alizadeh Sani; Asma Ghandeharioun; Reihane Boghrati

doi:10.1109/ICDMW.2012.29

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

76 Scopus citations

Abstract

Cardiovascular diseases are among the main causes of death worldwide, and coronary artery disease (CAD) is a major type. This disease occurs when the diameter narrowing of one of the left anterior descending, left circumflex, or right coronary arteries is equal to or greater than 50 percent. Angiography is the principal diagnostic modality for stenosis of the heart vessels; however, because of its complications and costs, researchers are looking for alternative methods such as data mining. This study applies data mining algorithms to the Z-Alizadeh Sani dataset, which was collected from 303 random visitors to Tehran's Shaheed Rajaei Cardiovascular, Medical and Research Center. The paper investigates why a preprocessing algorithm is effective on this dataset. This algorithm, which was only introduced in our previous works, extracts three new features from the dataset. These features are then used to enrich the primary dataset in order to achieve more accurate results. Moreover, although misclassifying diseased patients has more serious consequences than misclassifying healthy ones, to the best of our knowledge cost-sensitive algorithms have not yet been used in this field. Therefore, cost-sensitive algorithms with the base classifiers Naïve Bayes, Sequential Minimal Optimization (SMO), K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and C4.5 were evaluated using 10-fold cross-validation. The SMO algorithm yielded very high sensitivity (97.22%) and accuracy (92.09%), a combination that has not been reported in the existing literature.
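The idea behind cost-sensitive classification and the two reported metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the 5:1 cost ratio, the label encoding, and the function names `min_cost_label` and `sensitivity_and_accuracy` are assumptions chosen only for illustration. The sketch picks the label with the lowest expected misclassification cost and computes sensitivity and accuracy from true and predicted labels.

```python
from typing import Sequence, Tuple

CAD, NORMAL = 1, 0  # assumed label encoding, not taken from the paper

# Hypothetical cost matrix, COST[actual][predicted]; the 5:1 ratio is
# illustrative only and is not a value reported in the abstract.
COST = {
    CAD: {CAD: 0.0, NORMAL: 5.0},     # missed diseased patient (false negative)
    NORMAL: {CAD: 1.0, NORMAL: 0.0},  # healthy patient flagged (false positive)
}


def min_cost_label(p_cad: float) -> int:
    """Return the label with the lowest expected cost, given P(CAD)."""
    expected = {
        pred: p_cad * COST[CAD][pred] + (1.0 - p_cad) * COST[NORMAL][pred]
        for pred in (CAD, NORMAL)
    }
    return min(expected, key=expected.get)


def sensitivity_and_accuracy(
    actual: Sequence[int], predicted: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); accuracy = correct / total."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label sequences must be non-empty and equal in length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == CAD and p == CAD)
    fn = sum(1 for a, p in pairs if a == CAD and p == NORMAL)
    if tp + fn == 0:
        raise ValueError("no diseased cases: sensitivity is undefined")
    correct = sum(1 for a, p in pairs if a == p)
    return tp / (tp + fn), correct / len(pairs)
```

With these assumed costs, a patient with an estimated CAD probability of 0.2 is still labeled CAD, because the expected cost of a miss (0.2 × 5) exceeds the expected cost of a false alarm (0.8 × 1). A plain 0.5 threshold would label the same patient as normal, which is how weighting false negatives more heavily tends to raise sensitivity.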

Original language	English (US)
Title of host publication	Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012
Pages	9-16
Number of pages	8
DOIs	https://doi.org/10.1109/ICDMW.2012.29
State	Published - 2012
Externally published	Yes
Event	12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 - Brussels, Belgium
Duration	Dec 10 2012 → Dec 10 2012

Keywords

C4.5 algorithm
Component
Coronary Artery Disease
Cost Sensitive Algorithms
Data Mining
Feature Extraction
Naïve Bayes algorithm

ASJC Scopus subject areas

Software

Access to Document

10.1109/ICDMW.2012.29

Cite this

Alizadehsani, R., Hosseini, M. J., Sani, Z. A., Ghandeharioun, A., & Boghrati, R. (2012). Diagnosis of coronary artery disease using cost-sensitive algorithms. In Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 (pp. 9-16). Article 6406417 (Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012). https://doi.org/10.1109/ICDMW.2012.29

@inproceedings{9e0669ecc72f48b1806d62f3cb3c5bfc,

title = "Diagnosis of coronary artery disease using cost-sensitive algorithms",

keywords = "C4.5 algorithm, Component, Coronary Artery Disease, Cost Sensitive Algorithms, Data Mining, Feature Extraction, Na{\"i}ve Bayes algorithm",

author = "Roohallah Alizadehsani and Hosseini, {Mohammad Javad} and Sani, {Zahra Alizadeh} and Asma Ghandeharioun and Reihane Boghrati",

year = "2012",

doi = "10.1109/ICDMW.2012.29",

language = "English (US)",

isbn = "9780769549255",

series = "Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012",

pages = "9--16",

booktitle = "Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012",

note = "12th IEEE International Conference on Data Mining Workshops, ICDMW 2012 ; Conference date: 10-12-2012 Through 10-12-2012",

}

TY - GEN

T1 - Diagnosis of coronary artery disease using cost-sensitive algorithms

AU - Alizadehsani, Roohallah

AU - Hosseini, Mohammad Javad

AU - Sani, Zahra Alizadeh

AU - Ghandeharioun, Asma

AU - Boghrati, Reihane

PY - 2012

Y1 - 2012

KW - C4.5 algorithm

KW - Component

KW - Coronary Artery Disease

KW - Cost Sensitive Algorithms

KW - Data Mining

KW - Feature Extraction

KW - Naïve Bayes algorithm

UR - http://www.scopus.com/inward/record.url?scp=84873135633&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873135633&partnerID=8YFLogxK

U2 - 10.1109/ICDMW.2012.29

DO - 10.1109/ICDMW.2012.29

M3 - Conference contribution

AN - SCOPUS:84873135633

SN - 9780769549255

T3 - Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012

SP - 9

EP - 16

BT - Proceedings - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012

T2 - 12th IEEE International Conference on Data Mining Workshops, ICDMW 2012

Y2 - 10 December 2012 through 10 December 2012

ER -
