Interpretable regularized class association rules algorithm for classification in a categorical data space

Mohamed Azmi; George C. Runger; Abdelaziz Berrado

doi:10.1016/j.ins.2019.01.047

Ira A. Fulton Schools of Engineering (IAFSE)

Research output: Contribution to journal › Article › peer-review

43 Scopus citations

Abstract

Association rules have been used successfully in classification, producing high-accuracy classifiers. Even so, the principal advantage of associative classifiers lies in their interpretability. However, pruning the useless rules from the huge set of mined rules, and combining the remaining rules to build a classifier, remain subjects for improvement and further research. In this paper, we introduce RCAR, a new algorithm that builds a classifier based on Regularized Class Association Rules in a categorical data space. The algorithm has three steps. First, it mines an exhaustive set of Class Association Rules (CARs) according to predefined support and confidence thresholds. Second, it applies a regularized logistic regression algorithm with a Lasso penalty on the rule space to build a model that predicts the conditional probability of the outcome. Useless rules are pruned thanks to the selective nature of Lasso regularization. Third, it uses metarules to organize and visualize the CARs that survive this first pruning step. An optional further pruning step can be undertaken on the basis of the metarules and subject knowledge. The empirical results indicate that RCAR gives accuracy comparable to Random Forest and GBM.
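
The first two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy data, the function names (mine_cars, rule_features, fit_lasso_logistic), the brute-force rule enumeration, and the proximal gradient solver are all assumptions chosen to keep the example self-contained, and the metarule step is omitted.

```python
from itertools import combinations
import math

def mine_cars(rows, labels, target, min_support, min_confidence, max_len=2):
    """Brute-force enumeration of rules 'itemset -> target' meeting both thresholds."""
    n = len(rows)
    items = sorted({(col, val) for row in rows for col, val in row.items()})
    rules = []
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            if len({col for col, _ in itemset}) < k:
                continue  # an attribute cannot take two values at once
            covered = [i for i, row in enumerate(rows)
                       if all(row.get(c) == v for c, v in itemset)]
            if not covered:
                continue
            hits = sum(1 for i in covered if labels[i] == target)
            support = hits / n                # antecedent and class together
            confidence = hits / len(covered)  # class given antecedent
            if support >= min_support and confidence >= min_confidence:
                rules.append(itemset)
    return rules

def rule_features(rows, rules):
    """One binary column per rule: 1.0 if the row satisfies the rule's antecedent."""
    return [[1.0 if all(row.get(c) == v for c, v in rule) else 0.0
             for rule in rules] for row in rows]

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_lasso_logistic(X, y, lam=0.05, lr=0.1, n_iter=3000):
    """L1-penalized logistic regression via proximal gradient descent.
    The intercept b is not penalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Hypothetical toy data: two categorical attributes and a binary outcome.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
]
labels = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]

# Step 1: mine CARs for the class "yes" under support/confidence thresholds.
rules = mine_cars(rows, labels, "yes", min_support=0.2, min_confidence=0.6)

# Step 2: fit a Lasso-penalized logistic model on the rule space.
X = rule_features(rows, rules)
y = [1.0 if label == "yes" else 0.0 for label in labels]
w, b = fit_lasso_logistic(X, y)

# Rules whose coefficient was shrunk exactly to zero are pruned.
survivors = [rule for rule, wj in zip(rules, w) if wj != 0.0]
print(len(rules), "rules mined;", len(survivors), "kept after Lasso")
```

In this sketch the penalty strength lam plays the pruning role the abstract attributes to Lasso regularization: larger values drive more rule coefficients to exactly zero. In practice a dedicated rule miner and a tested regularized regression solver would replace the brute-force loops.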

Original language	English (US)
Pages (from-to)	313-331
Number of pages	19
Journal	Information Sciences
Volume	483
DOIs	https://doi.org/10.1016/j.ins.2019.01.047
State	Published - May 2019

Keywords

Association rules
Class association rules
Classification
Ensemble learning
Pruning
Regularization

ASJC Scopus subject areas

Software
Control and Systems Engineering
Theoretical Computer Science
Computer Science Applications
Information Systems and Management
Artificial Intelligence

Cite this

@article{4e105df20af74bcbbd0cae603537870d,

title = "Interpretable regularized class association rules algorithm for classification in a categorical data space",

abstract = "Association rules have been used successfully in classification, producing high-accuracy classifiers. Even so, the principal advantage of associative classifiers lies in their interpretability. However, pruning the useless rules from the huge set of mined rules, and combining the remaining rules to build a classifier, remain subjects for improvement and further research. In this paper, we introduce RCAR, a new algorithm that builds a classifier based on Regularized Class Association Rules in a categorical data space. The algorithm has three steps. First, it mines an exhaustive set of Class Association Rules (CARs) according to predefined support and confidence thresholds. Second, it applies a regularized logistic regression algorithm with a Lasso penalty on the rule space to build a model that predicts the conditional probability of the outcome. Useless rules are pruned thanks to the selective nature of Lasso regularization. Third, it uses metarules to organize and visualize the CARs that survive this first pruning step. An optional further pruning step can be undertaken on the basis of the metarules and subject knowledge. The empirical results indicate that RCAR gives accuracy comparable to Random Forest and GBM.",

keywords = "Association rules, Class association rules, Classification, Ensemble learning, Pruning, Regularization",

author = "Mohamed Azmi and Runger, {George C.} and Abdelaziz Berrado",

note = "Publisher Copyright: {\textcopyright} 2019 Elsevier Inc.",

year = "2019",

month = may,

doi = "10.1016/j.ins.2019.01.047",

language = "English (US)",

volume = "483",

pages = "313--331",

journal = "Information Sciences",

issn = "0020-0255",

publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - Interpretable regularized class association rules algorithm for classification in a categorical data space

AU - Azmi, Mohamed

AU - Runger, George C.

AU - Berrado, Abdelaziz

PY - 2019/5

Y1 - 2019/5

N2 - Association rules have been used successfully in classification, producing high-accuracy classifiers. Even so, the principal advantage of associative classifiers lies in their interpretability. However, pruning the useless rules from the huge set of mined rules, and combining the remaining rules to build a classifier, remain subjects for improvement and further research. In this paper, we introduce RCAR, a new algorithm that builds a classifier based on Regularized Class Association Rules in a categorical data space. The algorithm has three steps. First, it mines an exhaustive set of Class Association Rules (CARs) according to predefined support and confidence thresholds. Second, it applies a regularized logistic regression algorithm with a Lasso penalty on the rule space to build a model that predicts the conditional probability of the outcome. Useless rules are pruned thanks to the selective nature of Lasso regularization. Third, it uses metarules to organize and visualize the CARs that survive this first pruning step. An optional further pruning step can be undertaken on the basis of the metarules and subject knowledge. The empirical results indicate that RCAR gives accuracy comparable to Random Forest and GBM.

KW - Association rules

KW - Class association rules

KW - Classification

KW - Ensemble learning

KW - Pruning

KW - Regularization

UR - http://www.scopus.com/inward/record.url?scp=85060333227&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060333227&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2019.01.047

DO - 10.1016/j.ins.2019.01.047

M3 - Article

AN - SCOPUS:85060333227

SN - 0020-0255

VL - 483

SP - 313

EP - 331

JO - Information Sciences

JF - Information Sciences

ER -
