Abstract

Tensor decomposition is used for many web and user data analysis operations, from clustering, trend detection, and anomaly detection to correlation analysis. However, many tensor decomposition schemes are sensitive to noisy data, an inevitable problem in the real world that can lead to false conclusions. The problem is compounded by overfitting when the user data is sparse. Recent research has shown that it is possible to avoid overfitting by relying on probabilistic techniques. However, these techniques have two major deficiencies: (a) they assume that all the data and intermediary results can fit in main memory, and (b) they treat the entire tensor uniformly, ignoring potential non-uniformities in the noise distribution. In this paper, we propose a Noise-Profile Adaptive Tensor Decomposition (nTD) method, which aims to tackle both of these challenges. In particular, nTD leverages a grid-based two-phase decomposition strategy for two complementary purposes: first, the grid partitioning helps ensure that the memory footprint of the decomposition is kept low; second (and perhaps more importantly), any a priori knowledge about the noise profiles of the grid partitions enables us to develop a sample assignment strategy (or s-strategy) that best suits the noise distribution of the given tensor. Experiments show that nTD’s performance is significantly better than that of conventional CP decomposition techniques on noisy user data tensors.

Original language: English (US)
Title of host publication: 26th International World Wide Web Conference, WWW 2017
Publisher: International World Wide Web Conferences Steering Committee
Pages: 243-252
Number of pages: 10
ISBN (Print): 9781450349147
DOI: https://doi.org/10.1145/3038912.3052641
State: Published - Jan 1 2017
Event: 26th International World Wide Web Conference, WWW 2017 - Perth, Australia
Duration: Apr 3 2017 to Apr 7 2017

Other

Other: 26th International World Wide Web Conference, WWW 2017
Country: Australia
City: Perth
Period: 4/3/17 to 4/7/17

Fingerprint

Tensors
Decomposition
Data storage equipment
Experiments

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications

Cite this

Li, X., Candan, K., & Sapino, M. L. (2017). nTD: Noise-profile adaptive tensor decomposition. In 26th International World Wide Web Conference, WWW 2017 (pp. 243-252). [3052641] International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/3038912.3052641

nTD: Noise-profile adaptive tensor decomposition. / Li, Xinsheng; Candan, Kasim; Sapino, Maria Luisa.

26th International World Wide Web Conference, WWW 2017. International World Wide Web Conferences Steering Committee, 2017. p. 243-252 3052641.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

@inproceedings{6dceef28b8fc40cba499094eee148068,
title = "nTD: Noise-profile adaptive tensor decomposition",
abstract = "Tensor decomposition is used for many web and user data analysis operations, from clustering, trend detection, and anomaly detection to correlation analysis. However, many tensor decomposition schemes are sensitive to noisy data, an inevitable problem in the real world that can lead to false conclusions. The problem is compounded by overfitting when the user data is sparse. Recent research has shown that it is possible to avoid overfitting by relying on probabilistic techniques. However, these techniques have two major deficiencies: (a) they assume that all the data and intermediary results can fit in main memory, and (b) they treat the entire tensor uniformly, ignoring potential non-uniformities in the noise distribution. In this paper, we propose a Noise-Profile Adaptive Tensor Decomposition (nTD) method, which aims to tackle both of these challenges. In particular, nTD leverages a grid-based two-phase decomposition strategy for two complementary purposes: first, the grid partitioning helps ensure that the memory footprint of the decomposition is kept low; second (and perhaps more importantly), any a priori knowledge about the noise profiles of the grid partitions enables us to develop a sample assignment strategy (or s-strategy) that best suits the noise distribution of the given tensor. Experiments show that nTD’s performance is significantly better than that of conventional CP decomposition techniques on noisy user data tensors.",
author = "Li, Xinsheng and Candan, Kasim and Sapino, {Maria Luisa}",
year = "2017",
month = "1",
day = "1",
doi = "10.1145/3038912.3052641",
language = "English (US)",
isbn = "9781450349147",
pages = "243--252",
booktitle = "26th International World Wide Web Conference, WWW 2017",
publisher = "International World Wide Web Conferences Steering Committee",
}

TY - GEN

T1 - nTD

T2 - Noise-profile adaptive tensor decomposition

AU - Li, Xinsheng

AU - Candan, Kasim

AU - Sapino, Maria Luisa

PY - 2017/1/1

Y1 - 2017/1/1

AB - Tensor decomposition is used for many web and user data analysis operations, from clustering, trend detection, and anomaly detection to correlation analysis. However, many tensor decomposition schemes are sensitive to noisy data, an inevitable problem in the real world that can lead to false conclusions. The problem is compounded by overfitting when the user data is sparse. Recent research has shown that it is possible to avoid overfitting by relying on probabilistic techniques. However, these techniques have two major deficiencies: (a) they assume that all the data and intermediary results can fit in main memory, and (b) they treat the entire tensor uniformly, ignoring potential non-uniformities in the noise distribution. In this paper, we propose a Noise-Profile Adaptive Tensor Decomposition (nTD) method, which aims to tackle both of these challenges. In particular, nTD leverages a grid-based two-phase decomposition strategy for two complementary purposes: first, the grid partitioning helps ensure that the memory footprint of the decomposition is kept low; second (and perhaps more importantly), any a priori knowledge about the noise profiles of the grid partitions enables us to develop a sample assignment strategy (or s-strategy) that best suits the noise distribution of the given tensor. Experiments show that nTD’s performance is significantly better than that of conventional CP decomposition techniques on noisy user data tensors.

UR - http://www.scopus.com/inward/record.url?scp=85051512640&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85051512640&partnerID=8YFLogxK

U2 - 10.1145/3038912.3052641

DO - 10.1145/3038912.3052641

M3 - Conference contribution

SN - 9781450349147

SP - 243

EP - 252

BT - 26th International World Wide Web Conference, WWW 2017

PB - International World Wide Web Conferences Steering Committee

ER -