Abstract

We're surrounded by huge amounts of large-scale high-dimensional data, but learning tasks require reduced data dimensionality. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensive models, improving learning performance, and preparing clean, understandable data. Some unique characteristics of big data such as data velocity and data variety have presented challenges to the feature selection problem. In this article, the authors envision these challenges for big data analytics. To facilitate and promote feature selection research, they present an open source feature selection repository (scikit-feature) of popular algorithms.

Original language: English (US)
Article number: 7887649
Pages (from-to): 9-15
Number of pages: 7
Journal: IEEE Intelligent Systems
Volume: 32
Issue number: 2
DOI: 10.1109/MIS.2017.38
State: Published - Mar 1 2017

Fingerprint

Feature extraction
Big data

Keywords

  • big data
  • feature selection
  • intelligent systems
  • repository

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

Challenges of Feature Selection for Big Data Analytics. / Li, Jundong; Liu, Huan.

In: IEEE Intelligent Systems, Vol. 32, No. 2, 7887649, 01.03.2017, p. 9-15.

Research output: Contribution to journal (Article)

@article{bf8739db7d9f4b0e9f7dc6d8c2bdd1dd,
title = "Challenges of Feature Selection for Big Data Analytics",
abstract = "We're surrounded by huge amounts of large-scale high-dimensional data, but learning tasks require reduced data dimensionality. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensive models, improving learning performance, and preparing clean, understandable data. Some unique characteristics of big data such as data velocity and data variety have presented challenges to the feature selection problem. In this article, the authors envision these challenges for big data analytics. To facilitate and promote feature selection research, they present an open source feature selection repository (scikit-feature) of popular algorithms.",
keywords = "big data, feature selection, intelligent systems, repository",
author = "Jundong Li and Huan Liu",
year = "2017",
month = "3",
day = "1",
doi = "10.1109/MIS.2017.38",
language = "English (US)",
volume = "32",
pages = "9--15",
journal = "IEEE Intelligent Systems",
issn = "1541-1672",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",
}

TY - JOUR

T1 - Challenges of Feature Selection for Big Data Analytics

AU - Li, Jundong

AU - Liu, Huan

PY - 2017/3/1

Y1 - 2017/3/1

N2 - We're surrounded by huge amounts of large-scale high-dimensional data, but learning tasks require reduced data dimensionality. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensive models, improving learning performance, and preparing clean, understandable data. Some unique characteristics of big data such as data velocity and data variety have presented challenges to the feature selection problem. In this article, the authors envision these challenges for big data analytics. To facilitate and promote feature selection research, they present an open source feature selection repository (scikit-feature) of popular algorithms.

AB - We're surrounded by huge amounts of large-scale high-dimensional data, but learning tasks require reduced data dimensionality. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensive models, improving learning performance, and preparing clean, understandable data. Some unique characteristics of big data such as data velocity and data variety have presented challenges to the feature selection problem. In this article, the authors envision these challenges for big data analytics. To facilitate and promote feature selection research, they present an open source feature selection repository (scikit-feature) of popular algorithms.

KW - big data

KW - feature selection

KW - intelligent systems

KW - repository

UR - http://www.scopus.com/inward/record.url?scp=85017141803&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85017141803&partnerID=8YFLogxK

U2 - 10.1109/MIS.2017.38

DO - 10.1109/MIS.2017.38

M3 - Article

VL - 32

SP - 9

EP - 15

JO - IEEE Intelligent Systems

JF - IEEE Intelligent Systems

SN - 1541-1672

IS - 2

M1 - 7887649

ER -