Abstract

Tensor decomposition is used for many web and user data analysis operations from clustering, trend detection, anomaly detection, to correlation analysis. However, many of the tensor decomposition schemes are sensitive to noisy data, an inevitable problem in the real world that can lead to false conclusions. The problem is compounded by overfitting when the user data is sparse. Recent research has shown that it is possible to avoid over-fitting by relying on probabilistic techniques. However, these have two major deficiencies: (a) firstly, they assume that all the data and intermediary results can fit in the main memory, and (b) they treat the entire tensor uniformly, ignoring potential non-uniformities in the noise distribution. In this paper, we propose a Noise-Profile Adaptive Tensor Decomposition (nTD) method, which aims to tackle both of these challenges. In particular, nTD leverages a grid-based two-phase decomposition strategy for two complementary purposes: firstly, the grid partitioning helps ensure that the memory footprint of the decomposition is kept low; secondly (and perhaps more importantly) any a priori knowledge about the noise profiles of the grid partitions enable us to develop a sample assignment strategy (or s-strategy) that best suits the noise distribution of the given tensor. Experiments show that nTD’s performance is significantly better than conventional CP decomposition techniques on noisy user data tensors.

Original languageEnglish (US)
Title of host publication26th International World Wide Web Conference, WWW 2017
PublisherInternational World Wide Web Conferences Steering Committee
Pages243-252
Number of pages10
ISBN (Print)9781450349130
DOIs
StatePublished - 2017
Event26th International World Wide Web Conference, WWW 2017 - Perth, Australia
Duration: Apr 3 2017Apr 7 2017

Publication series

Name26th International World Wide Web Conference, WWW 2017

Other

Other26th International World Wide Web Conference, WWW 2017
Country/TerritoryAustralia
CityPerth
Period4/3/174/7/17

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'nTD: Noise-profile adaptive tensor decomposition'. Together they form a unique fingerprint.

Cite this