A Review of Distributed Algorithms for Principal Component Analysis

Sissi Xiaoxiao Wu; Hoi To Wai; Lin Li; Anna Scaglione

doi:10.1109/JPROC.2018.2846568

Electrical, Computer, and Energy Engineering, School of (IAFSE-ECEE)

Research output: Contribution to journal › Article › peer-review

75 Scopus citations

Abstract

Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to overcome the need for communicating and accessing the entire array locally. A key feature of distributed PCA algorithms is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper is a survey of the methodologies to perform distributed PCA on different data sets, their performance, and their applications in the context of distributed data acquisition systems.

Original language	English (US)
Article number	8425655
Pages (from-to)	1321-1340
Number of pages	20
Journal	Proceedings of the IEEE
Volume	106
Issue number	8
DOIs	https://doi.org/10.1109/JPROC.2018.2846568
State	Published - Aug 2018

Keywords

Clustering algorithms
data mining
distributed algorithms
principal component analysis
radar signal processing

ASJC Scopus subject areas

General Computer Science
Electrical and Electronic Engineering

Cite this

@article{df9f8b3a92b346f1a9228a2574f02431,

title = "A Review of Distributed Algorithms for Principal Component Analysis",

abstract = "Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to overcome the need of communicating and accessing the entire array locally. A key feature of distributed PCA algorithm is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper is a survey of the methodologies to perform distributed PCA on different data sets, their performance, and of their applications in the context of distributed data acquisition systems.",

keywords = "Clustering algorithms, data mining, distributed algorithms, principal component analysis, radar signal processing",

author = "Wu, {Sissi Xiaoxiao} and Wai, {Hoi To} and Lin Li and Anna Scaglione",

note = "Funding Information: Manuscript received February 24, 2018; revised May 27, 2018; accepted June 4, 2018. Date of current version August 2, 2018. This work was supported in part by the U.S. Air Force under Contracts FA8721-05-C-0002 and/or FA8702-15-D-0001; by the National Natural Science Foundation of China under Grant 61701315; by Shenzhen Technology R\&D Fund JCYJ20170817101149906 and JCYJ20170302145906843; by Shenzhen University Launch Fund 2018018; and by the U.S. National Science Foundation under Grants EAGER CCF 1553746, NSF CCF-BSF 1714672, and BSF 2016660. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. Government. (Corresponding author: Sissi Xiaoxiao Wu.) S. X. Wu is with the Department of Communication and Information Engineering, Shenzhen University, China (e-mail: xxwu.eesissi@szu.edu.cn). H.-T. Wai and A. Scaglione are with the Ira A. Fulton School of Electrical Computer and Energy Engineering, Arizona State University, USA (e-mail: htwai.Scaglione@asu.edu; Anna.Scaglione@asu.edu). L. Li is with the Massachusetts Institute of Technology Lincoln Laboratory, USA (e-mail: lin.li@ll.mit.edu). Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",

year = "2018",

month = aug,

doi = "10.1109/JPROC.2018.2846568",

language = "English (US)",

volume = "106",

pages = "1321--1340",

journal = "Proceedings of the IEEE",

issn = "0018-9219",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "8",

}

TY - JOUR

T1 - A Review of Distributed Algorithms for Principal Component Analysis

AU - Wu, Sissi Xiaoxiao

AU - Wai, Hoi To

AU - Li, Lin

AU - Scaglione, Anna

N1 - Funding Information: Manuscript received February 24, 2018; revised May 27, 2018; accepted June 4, 2018. Date of current version August 2, 2018. This work was supported in part by the U.S. Air Force under Contracts FA8721-05-C-0002 and/or FA8702-15-D-0001; by the National Natural Science Foundation of China under Grant 61701315; by Shenzhen Technology R&D Fund JCYJ20170817101149906 and JCYJ20170302145906843; by Shenzhen University Launch Fund 2018018; and by the U.S. National Science Foundation under Grants EAGER CCF 1553746, NSF CCF-BSF 1714672, and BSF 2016660. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. Government. (Corresponding author: Sissi Xiaoxiao Wu.) S. X. Wu is with the Department of Communication and Information Engineering, Shenzhen University, China (e-mail: xxwu.eesissi@szu.edu.cn). H.-T. Wai and A. Scaglione are with the Ira A. Fulton School of Electrical Computer and Energy Engineering, Arizona State University, USA (e-mail: htwai.Scaglione@asu.edu; Anna.Scaglione@asu.edu). L. Li is with the Massachusetts Institute of Technology Lincoln Laboratory, USA (e-mail: lin.li@ll.mit.edu). Publisher Copyright: © 1963-2012 IEEE.

PY - 2018/8

Y1 - 2018/8

N2 - Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to overcome the need of communicating and accessing the entire array locally. A key feature of distributed PCA algorithm is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper is a survey of the methodologies to perform distributed PCA on different data sets, their performance, and of their applications in the context of distributed data acquisition systems.

AB - Principal component analysis (PCA) is a fundamental primitive of many data analysis, array processing, and machine learning methods. In applications where extremely large arrays of data are involved, particularly in distributed data acquisition systems, distributed PCA algorithms can harness local communications and network connectivity to overcome the need of communicating and accessing the entire array locally. A key feature of distributed PCA algorithm is that they defy the conventional notion that the first step toward computing the principal vectors is to form a sample covariance. This paper is a survey of the methodologies to perform distributed PCA on different data sets, their performance, and of their applications in the context of distributed data acquisition systems.

KW - Clustering algorithms

KW - data mining

KW - distributed algorithms

KW - principal component analysis

KW - radar signal processing

UR - http://www.scopus.com/inward/record.url?scp=85051204310&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85051204310&partnerID=8YFLogxK

U2 - 10.1109/JPROC.2018.2846568

DO - 10.1109/JPROC.2018.2846568

M3 - Article

AN - SCOPUS:85051204310

SN - 0018-9219

VL - 106

SP - 1321

EP - 1340

JO - Proceedings of the IEEE

JF - Proceedings of the IEEE

IS - 8

M1 - 8425655

ER -
