Abstract

Many multi-dimensional data applications need both tensor operations and relational operations to be supported throughout the data lifecycle. Tensor-based representations (including two widely used tensor decompositions, CP and Tucker) have proven effective in multi-aspect data analysis, and tensor decomposition is an important tool for capturing high-order structures in multi-dimensional data. Although tensor decomposition is effective for multi-dimensional data analysis, its cost is often very high. Because the number of modes of the tensor data is one of the main factors contributing to the cost of tensor operations, in this paper we focus on reducing the modality (the number of modes) of the input tensors to lower the computational cost of the tensor decomposition process. We propose a novel decomposition-by-normalization scheme that first normalizes the given relation into smaller tensors based on the functional dependencies of the relation, then decomposes these smaller tensors, and finally recombines the sub-results to obtain the overall decomposition. The decomposition and recombination steps of the scheme fit naturally in settings with multiple cores. This leads to a highly efficient, effective, and parallelized decomposition-by-normalization algorithm for both dense and sparse tensors and for both CP and Tucker decompositions. Experimental results confirm the efficiency and effectiveness of the proposed scheme compared to conventional nonnegative CP decomposition and Tucker decomposition approaches.
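The normalization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy relation, the attribute names, the exact (rather than approximate) functional dependency, and all helper functions are hypothetical and chosen only for illustration. The sketch shows how a functional dependency lets a 4-attribute relation (a 4-mode tensor) be split into two smaller relations with fewer modes, and that joining them recovers the original. The decomposition and recombination steps themselves are not shown.

```python
# Hypothetical toy relation with attributes (student, course, dept, instructor).
# Assumed exact functional dependency: course -> (dept, instructor).
rows = [
    ("ann", "db", "cs", "kim"),
    ("bob", "db", "cs", "kim"),
    ("ann", "ml", "cs", "lee"),
    ("cat", "alg", "math", "roy"),
]


def fd_holds(rows, lhs, rhs):
    """Return True if the column positions in lhs determine those in rhs."""
    seen = {}
    for r in rows:
        key = tuple(r[i] for i in lhs)
        val = tuple(r[i] for i in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, cols):
    """Project the relation onto the given column positions (set semantics)."""
    return sorted({tuple(r[i] for i in cols) for r in rows})


def natural_join(r1, r2):
    """Join r1(student, course) with r2(course, dept, instructor) on course."""
    return sorted(
        (s, c, d, i) for (s, c) in r1 for (c2, d, i) in r2 if c == c2
    )


def dense_cells(rows):
    """Cell count of the dense tensor whose modes are the relation's columns."""
    n = 1
    for col in zip(*rows):
        n *= len(set(col))
    return n


# Normalize on the join attribute "course" (column 1).
r1 = project(rows, [0, 1])      # 2-mode sub-tensor: (student, course)
r2 = project(rows, [1, 2, 3])   # 3-mode sub-tensor: (course, dept, instructor)

lossless = natural_join(r1, r2) == sorted(rows)
sizes = (dense_cells(rows), dense_cells(r1), dense_cells(r2))
```

In this toy case the dependency holds, the join is lossless, and the two sub-tensors together have fewer dense cells than the original 4-mode tensor, which is the intuition behind reducing the number of modes before decomposing.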

Original language: English (US)
Pages (from-to): 1-46
Number of pages: 46
Journal: Data Mining and Knowledge Discovery
Volume: 30
Issue number: 1
DOI: 10.1007/s10618-015-0401-6
State: Published - Jan 1 2016

Keywords

  • CP decomposition
  • Parallel CP decomposition
  • Parallel tensor decomposition
  • Parallel Tucker decomposition
  • Tensor decomposition
  • Tucker decomposition

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications

Cite this

@article{747ed58a70c041afa2caa656fa58a62c,
title = "Decomposition-by-normalization ({DBN}): leveraging approximate functional dependencies for efficient {CP} and {Tucker} decompositions",
abstract = "For many multi-dimensional data applications, tensor operations as well as relational operations both need to be supported throughout the data lifecycle. Tensor based representations (including two widely used tensor decompositions, CP and Tucker decompositions) are proven to be effective in multi-aspect data analysis and tensor decomposition is an important tool for capturing high-order structures in multi-dimensional data. Although tensor decomposition is shown to be effective for multi-dimensional data analysis, the cost of tensor decomposition is often very high. Since the number of modes of the tensor data is one of the main factors contributing to the costs of the tensor operations, in this paper, we focus on reducing the modality of the input tensors to tackle the computational cost of the tensor decomposition process. We propose a novel decomposition-by-normalization scheme that first normalizes the given relation into smaller tensors based on the functional dependencies of the relation, decomposes these smaller tensors, and then recombines the sub-results to obtain the overall decomposition. The decomposition and recombination steps of the decomposition-by-normalization scheme fit naturally in settings with multiple cores. This leads to a highly efficient, effective, and parallelized decomposition-by-normalization algorithm for both dense and sparse tensors for CP and Tucker decompositions. Experimental results confirm the efficiency and effectiveness of the proposed decomposition-by-normalization scheme compared to the conventional nonnegative CP decomposition and Tucker decomposition approaches.",
keywords = "CP decomposition, Parallel CP decomposition, Parallel tensor decomposition, Parallel Tucker decomposition, Tensor decomposition, Tucker decomposition",
author = "Mijung Kim and Kasim Candan",
year = "2016",
month = "1",
day = "1",
doi = "10.1007/s10618-015-0401-6",
language = "English (US)",
volume = "30",
pages = "1--46",
journal = "Data Mining and Knowledge Discovery",
issn = "1384-5810",
publisher = "Springer Netherlands",
number = "1"
}

TY - JOUR

T1 - Decomposition-by-normalization (DBN)

T2 - leveraging approximate functional dependencies for efficient CP and Tucker decompositions

AU - Kim, Mijung

AU - Candan, Kasim

PY - 2016/1/1

Y1 - 2016/1/1

AB - For many multi-dimensional data applications, tensor operations as well as relational operations both need to be supported throughout the data lifecycle. Tensor based representations (including two widely used tensor decompositions, CP and Tucker decompositions) are proven to be effective in multi-aspect data analysis and tensor decomposition is an important tool for capturing high-order structures in multi-dimensional data. Although tensor decomposition is shown to be effective for multi-dimensional data analysis, the cost of tensor decomposition is often very high. Since the number of modes of the tensor data is one of the main factors contributing to the costs of the tensor operations, in this paper, we focus on reducing the modality of the input tensors to tackle the computational cost of the tensor decomposition process. We propose a novel decomposition-by-normalization scheme that first normalizes the given relation into smaller tensors based on the functional dependencies of the relation, decomposes these smaller tensors, and then recombines the sub-results to obtain the overall decomposition. The decomposition and recombination steps of the decomposition-by-normalization scheme fit naturally in settings with multiple cores. This leads to a highly efficient, effective, and parallelized decomposition-by-normalization algorithm for both dense and sparse tensors for CP and Tucker decompositions. Experimental results confirm the efficiency and effectiveness of the proposed decomposition-by-normalization scheme compared to the conventional nonnegative CP decomposition and Tucker decomposition approaches.

KW - CP decomposition

KW - Parallel CP decomposition

KW - Parallel tensor decomposition

KW - Parallel Tucker decomposition

KW - Tensor decomposition

KW - Tucker decomposition

UR - http://www.scopus.com/inward/record.url?scp=84953839186&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84953839186&partnerID=8YFLogxK

U2 - 10.1007/s10618-015-0401-6

DO - 10.1007/s10618-015-0401-6

M3 - Article

VL - 30

SP - 1

EP - 46

JO - Data Mining and Knowledge Discovery

JF - Data Mining and Knowledge Discovery

SN - 1384-5810

IS - 1

ER -