Abstract

Tensors are multi-dimensional arrays, and tensor decomposition operations (CP and Tucker) are the basis for many high-dimensional data analysis tasks, from clustering, trend detection, and anomaly detection to correlation analysis, in application domains including science and engineering. A key problem with tensor decomposition is its computational complexity and space requirements. In particular, as the relevant data sets get denser, in-memory schemes for tensor decomposition become increasingly ineffective, and out-of-core (secondary-memory supported, potentially parallel) computing becomes necessary. However, existing techniques do not consider the I/O and network data exchange costs that out-of-core execution of tensor decomposition incurs. In this paper, we note that when the operation is implemented with the help of secondary memory and/or multiple servers to overcome memory limitations, it needs intelligent buffer-management and task-scheduling techniques that account for the cost of bringing the relevant blocks into the buffer, so as to minimize I/O. We introduce 2PCP, a two-phase, block-based CP decomposition system with buffer-sensitive task scheduling and buffer management mechanisms. 2PCP aims to reduce I/O costs in the analysis of the relatively dense tensors common in scientific and engineering applications. Experimental comparisons with current state-of-the-art tensor decomposition algorithms show that our algorithms can significantly reduce the amount of I/O and execution time while maintaining decomposition accuracy.
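The abstract's point that the order of block-level tasks affects I/O can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the 2PCP scheduler or buffer manager: it assumes a hypothetical n-by-n grid of tasks in which task (i, j) needs two data units, ("A", i) and ("B", j), a buffer that holds a fixed number of units with least-recently-used (LRU) replacement, and a cost of one fetch per buffer miss. The names `count_fetches`, `row_major`, and `snake` are illustrative and do not come from the paper.

```python
from collections import OrderedDict


def count_fetches(schedule, capacity):
    """Count buffer misses for a task schedule under LRU replacement.

    Assumption: task (i, j) needs the units ("A", i) and ("B", j).
    """
    buffer = OrderedDict()  # keys ordered from least to most recently used
    fetches = 0
    for i, j in schedule:
        for unit in (("A", i), ("B", j)):
            if unit in buffer:
                buffer.move_to_end(unit)  # hit: mark as most recently used
            else:
                fetches += 1  # miss: the unit must be read from disk
                buffer[unit] = True
                if len(buffer) > capacity:
                    buffer.popitem(last=False)  # evict least recently used
    return fetches


def row_major(n):
    """Visit the grid row by row, always left to right."""
    return [(i, j) for i in range(n) for j in range(n)]


def snake(n):
    """Visit the grid row by row, reversing direction on every other row."""
    order = []
    for i in range(n):
        cols = range(n) if i % 2 == 0 else range(n - 1, -1, -1)
        order.extend((i, j) for j in cols)
    return order


if __name__ == "__main__":
    n, capacity = 4, 3
    print("row-major fetches:", count_fetches(row_major(n), capacity))
    print("snake fetches:", count_fetches(snake(n), capacity))
```

With a 4-by-4 grid and a buffer of 3 units, the row-major order incurs 20 fetches and the snake order 17, because each reversal reuses the unit that is already in the buffer at the end of the previous row. Both orders perform the same 16 tasks, so the difference comes only from scheduling.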

Original language: English (US)
Title of host publication: 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 835-846
Number of pages: 12
ISBN (Electronic): 9781509020195
DOI: https://doi.org/10.1109/ICDE.2016.7498294
State: Published - Jun 22 2016
Event: 32nd IEEE International Conference on Data Engineering, ICDE 2016 - Helsinki, Finland
Duration: May 16 2016 to May 20 2016

Other

Other: 32nd IEEE International Conference on Data Engineering, ICDE 2016
Country: Finland
City: Helsinki
Period: 5/16/16 to 5/20/16

Fingerprint

Tensors
Decomposition
Data storage equipment
Scheduling
Costs
Electronic data interchange
Parallel processing systems
Computational complexity
Servers
Experiments

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management

Cite this

Li, X., Huang, S., Candan, K., & Sapino, M. L. (2016). 2PCP: Two-phase CP decomposition for billion-scale dense tensors. In 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016 (pp. 835-846). [7498294] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDE.2016.7498294

2PCP: Two-phase CP decomposition for billion-scale dense tensors. / Li, Xinsheng; Huang, Shengyu; Candan, Kasim; Sapino, Maria Luisa.

2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016. Institute of Electrical and Electronics Engineers Inc., 2016. p. 835-846 7498294.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

@inproceedings{a31e1c7def514a889ac598e9fdad8ee5,
title = "2PCP: Two-phase CP decomposition for billion-scale dense tensors",
author = "Xinsheng Li and Shengyu Huang and Kasim Candan and Sapino, {Maria Luisa}",
year = "2016",
month = "6",
day = "22",
doi = "10.1109/ICDE.2016.7498294",
language = "English (US)",
pages = "835--846",
booktitle = "2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - 2PCP

T2 - Two-phase CP decomposition for billion-scale dense tensors

AU - Li, Xinsheng

AU - Huang, Shengyu

AU - Candan, Kasim

AU - Sapino, Maria Luisa

PY - 2016/6/22

Y1 - 2016/6/22

UR - http://www.scopus.com/inward/record.url?scp=84980348012&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84980348012&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2016.7498294

DO - 10.1109/ICDE.2016.7498294

M3 - Conference contribution

SP - 835

EP - 846

BT - 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016

PB - Institute of Electrical and Electronics Engineers Inc.

ER -