TY - JOUR
T1 - A scalable, incremental learning algorithm for classification problems
AU - Ye, Nong
AU - Li, Xiangyang
N1 - Funding Information:
This work is sponsored in part by the Air Force Office of Scientific Research (AFOSR) under grant number F49620-99-1-001. The US government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either express or implied, of AFOSR or the US Government. We would like to thank Dr Evangelos Triantaphyllou and Dr T. Warren Liao for their kind guidance on improving the quality of this paper. We would also like to thank two reviewers whose comments have helped us improve the quality of the paper.
PY - 2002/9
Y1 - 2002/9
N2 - In this paper, a novel data mining algorithm, Clustering and Classification Algorithm-Supervised (CCA-S), is introduced. CCA-S enables the scalable, incremental learning of a non-hierarchical cluster structure from training data. This cluster structure serves as a function to map the attribute values of new data to the target class of these data, that is, to classify new data. CCA-S utilizes both the distance and the target class of training data points to derive the cluster structure. In this paper, we first present problems that many existing data mining algorithms for classification problems, such as decision trees and artificial neural networks, have with scalable and incremental learning. We then describe CCA-S and discuss its advantages in scalable, incremental learning. The testing results of applying CCA-S to several common data sets for classification problems are presented. The testing results show that the classification performance of CCA-S is comparable to that of other data mining algorithms such as decision trees, artificial neural networks and discriminant analysis.
AB - In this paper, a novel data mining algorithm, Clustering and Classification Algorithm-Supervised (CCA-S), is introduced. CCA-S enables the scalable, incremental learning of a non-hierarchical cluster structure from training data. This cluster structure serves as a function to map the attribute values of new data to the target class of these data, that is, to classify new data. CCA-S utilizes both the distance and the target class of training data points to derive the cluster structure. In this paper, we first present problems that many existing data mining algorithms for classification problems, such as decision trees and artificial neural networks, have with scalable and incremental learning. We then describe CCA-S and discuss its advantages in scalable, incremental learning. The testing results of applying CCA-S to several common data sets for classification problems are presented. The testing results show that the classification performance of CCA-S is comparable to that of other data mining algorithms such as decision trees, artificial neural networks and discriminant analysis.
KW - Classification
KW - Data mining
KW - Incremental learning
KW - Scalability
UR - http://www.scopus.com/inward/record.url?scp=0036712901&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036712901&partnerID=8YFLogxK
U2 - 10.1016/S0360-8352(02)00132-8
DO - 10.1016/S0360-8352(02)00132-8
M3 - Article
AN - SCOPUS:0036712901
SN - 0360-8352
VL - 43
SP - 677
EP - 692
JO - Computers and Industrial Engineering
JF - Computers and Industrial Engineering
IS - 4
ER -