A classification algorithm for high-dimensional data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

With the advent of high-dimensional stored big data and streaming data, machine learning on a very large scale has suddenly become a critical need. Such machine learning should be extremely fast, should scale up easily with volume and dimension, should be able to learn from streaming data, should automatically perform dimension reduction for high-dimensional data, and should be deployable on hardware. Neural networks are well positioned to address these challenges of large-scale machine learning. In this paper, we present a method that can effectively handle large-scale, high-dimensional data. It is an online method that can be used for both streaming data and large volumes of stored big data. It primarily uses Kohonen nets, although only a few selected neurons (nodes) from multiple Kohonen nets are actually retained in the end; we discard all Kohonen nets after training. We use Kohonen nets both for dimensionality reduction through feature selection and for building an ensemble of classifiers using single Kohonen neurons. The method is meant to exploit massive parallelism and should be easily deployable on hardware that implements Kohonen nets. Some initial computational results are presented.
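
The idea of training a Kohonen net online and then keeping only a few of its neurons as classifiers can be illustrated with a toy sketch. The code below is not the paper's algorithm: the abstract does not give the update rule, the selection criterion, or the feature-selection step, so everything here is an assumption made for illustration. It assumes a 1-D neuron grid, a constant learning rate, a Gaussian neighbourhood, hand-picked initial weights, and a hypothetical "class purity" rule for deciding which neurons to retain. All function names (`nearest`, `train_online`, `select_neurons`, `classify`) are invented for this example, and feature selection is omitted.

```python
import math
import random
from collections import Counter


def nearest(weights, x):
    """Index of the neuron whose weight vector is closest to x (the winner)."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))


def train_online(weights, stream, lr=0.1, radius=0.5):
    """Update a 1-D Kohonen net in place, one streamed sample at a time."""
    for x in stream:
        win = nearest(weights, x)
        for j, w in enumerate(weights):
            # Gaussian neighbourhood on the 1-D neuron grid around the winner.
            h = math.exp(-((j - win) ** 2) / (2 * radius ** 2))
            for k in range(len(w)):
                w[k] += lr * h * (x[k] - w[k])
    return weights


def select_neurons(weights, samples, labels, min_purity=0.9):
    """Assumed rule: keep neurons whose won samples are mostly one class."""
    wins = {}
    for x, y in zip(samples, labels):
        wins.setdefault(nearest(weights, x), Counter())[y] += 1
    kept = []
    for j in sorted(wins):
        label, count = wins[j].most_common(1)[0]
        if count / sum(wins[j].values()) >= min_purity:
            kept.append((list(weights[j]), label))
    return kept


def classify(kept, x):
    """Label x with the class of the nearest retained neuron."""
    return kept[nearest([w for w, _ in kept], x)][1]


# Toy stream: class "a" near (0.1, 0.1), class "b" near (0.9, 0.9).
rng = random.Random(0)
samples, labels = [], []
for i in range(200):
    centre, label = (0.1, "a") if i % 2 == 0 else (0.9, "b")
    samples.append([centre + rng.uniform(-0.05, 0.05) for _ in range(2)])
    labels.append(label)

# Hand-picked initial weights (an assumption; real nets are often random).
net = train_online([[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]], samples)
kept = select_neurons(net, samples, labels)  # the net itself can now be dropped
print(len(kept), classify(kept, [0.2, 0.1]), classify(kept, [0.8, 0.95]))
# prints: 2 a b
```

In this toy run the middle neuron never wins a sample, so only the two outer neurons are retained and the trained net is no longer needed for prediction. An ensemble in the sense of the abstract would presumably combine neurons retained from several such nets; how that is done is not stated in the abstract.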

Original language: English (US)
Title of host publication: Procedia Computer Science
Publisher: Elsevier
Pages: 345-355
Number of pages: 11
Volume: 53
Edition: 1
DOIs
State: Published - 2015
Event: INNS Conference on Big Data 2015 - San Francisco, United States
Duration: Aug 8 2015 to Aug 10 2015

Other

Other: INNS Conference on Big Data 2015
Country/Territory: United States
City: San Francisco
Period: 8/8/15 to 8/10/15

Keywords

  • Classification algorithm
  • Feature selection
  • High-dimensional data
  • Kohonen nets
  • Online learning

ASJC Scopus subject areas

  • General Computer Science
