A Review of Distributed Algorithms for Principal Component Analysis

Sissi Xiaoxiao Wu, Hoi To Wai, Lin Li, Anna Scaglione

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to remove the need to communicate and access the entire array locally. A key feature of distributed PCA algorithms is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper surveys the methodologies for performing distributed PCA on different data sets, their performance, and their applications in the context of distributed data acquisition systems.
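
To illustrate the remark that the principal vectors can be computed without first forming a sample covariance, here is a minimal sketch. It is not taken from the paper. It assumes the samples (rows) are split across a few hypothetical nodes, and it uses a plain power iteration in which each node only computes matrix-vector products with its own block. The function name `distributed_power_iteration`, the four-node split, and the synthetic data are illustrative assumptions. The network aggregation step is simulated by an ordinary sum.

```python
import numpy as np

def distributed_power_iteration(local_blocks, num_iters=200, seed=0):
    """Estimate the top principal vector from row-partitioned data.

    local_blocks: list of (n_i, d) arrays, one per hypothetical node,
    assumed to be centered already. No d x d covariance is ever formed.
    """
    d = local_blocks[0].shape[1]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        # Each node computes X_i^T (X_i v) with two matrix-vector products.
        local_terms = [X.T @ (X @ v) for X in local_blocks]
        # Stand-in for in-network aggregation (assumption: a real system
        # would use, e.g., consensus averaging over local links).
        w = np.sum(local_terms, axis=0)
        v = w / np.linalg.norm(w)
    return v

rng = np.random.default_rng(1)
# Synthetic data with a dominant direction along the first coordinate.
X = rng.standard_normal((600, 5)) * np.array([5.0, 1.0, 1.0, 1.0, 1.0])
X -= X.mean(axis=0)
blocks = np.array_split(X, 4)  # four hypothetical nodes
v_dist = distributed_power_iteration(blocks)

# Centralized reference, used for comparison only: the top eigenvector
# of the (unnormalized) sample covariance X^T X.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
v_ref = eigvecs[:, -1]
alignment = abs(v_dist @ v_ref)  # close to 1 when the directions agree
```

The centralized eigendecomposition appears only as a check. The distributed routine itself touches the data solely through products of the form X_i v and X_i^T u, which is what makes the covariance-free formulation attractive when the full array cannot be gathered in one place.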

Original language: English (US)
Article number: 8425655
Pages (from-to): 1321-1340
Number of pages: 20
Journal: Proceedings of the IEEE
Volume: 106
Issue number: 8
DOI: 10.1109/JPROC.2018.2846568
State: Published - Aug 1 2018
Externally published: Yes

Keywords

  • Clustering algorithms
  • data mining
  • distributed algorithms
  • principal component analysis
  • radar signal processing

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

A Review of Distributed Algorithms for Principal Component Analysis. / Wu, Sissi Xiaoxiao; Wai, Hoi To; Li, Lin; Scaglione, Anna.

In: Proceedings of the IEEE, Vol. 106, No. 8, 8425655, 01.08.2018, p. 1321-1340.

@article{df9f8b3a92b346f1a9228a2574f02431,
title = "A Review of Distributed Algorithms for Principal Component Analysis",
abstract = "Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to remove the need to communicate and access the entire array locally. A key feature of distributed PCA algorithms is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper surveys the methodologies for performing distributed PCA on different data sets, their performance, and their applications in the context of distributed data acquisition systems.",
keywords = "Clustering algorithms, data mining, distributed algorithms, principal component analysis, radar signal processing",
author = "Wu, {Sissi Xiaoxiao} and Wai, {Hoi To} and Lin Li and Anna Scaglione",
year = "2018",
month = "8",
day = "1",
doi = "10.1109/JPROC.2018.2846568",
language = "English (US)",
volume = "106",
pages = "1321--1340",
journal = "Proceedings of the IEEE",
issn = "0018-9219",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "8",

}

TY - JOUR

T1 - A Review of Distributed Algorithms for Principal Component Analysis

AU - Wu, Sissi Xiaoxiao

AU - Wai, Hoi To

AU - Li, Lin

AU - Scaglione, Anna

PY - 2018/8/1

Y1 - 2018/8/1

AB - Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to remove the need to communicate and access the entire array locally. A key feature of distributed PCA algorithms is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper surveys the methodologies for performing distributed PCA on different data sets, their performance, and their applications in the context of distributed data acquisition systems.

KW - Clustering algorithms

KW - data mining

KW - distributed algorithms

KW - principal component analysis

KW - radar signal processing

UR - http://www.scopus.com/inward/record.url?scp=85051204310&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85051204310&partnerID=8YFLogxK

U2 - 10.1109/JPROC.2018.2846568

DO - 10.1109/JPROC.2018.2846568

M3 - Article

VL - 106

SP - 1321

EP - 1340

JO - Proceedings of the IEEE

JF - Proceedings of the IEEE

SN - 0018-9219

IS - 8

M1 - 8425655

ER -