TY - GEN
T1 - nTD
T2 - 26th International World Wide Web Conference, WWW 2017
AU - Li, Xinsheng
AU - Candan, Kasim
AU - Sapino, Maria Luisa
N1 - Publisher Copyright: © 2017 International World Wide Web Conference Committee (IW3C2).
PY - 2017
Y1 - 2017
N2 - Tensor decomposition is used for many web and user data analysis operations, from clustering, trend detection, and anomaly detection to correlation analysis. However, many tensor decomposition schemes are sensitive to noisy data, an inevitable problem in the real world that can lead to false conclusions. The problem is compounded by overfitting when the user data is sparse. Recent research has shown that it is possible to avoid overfitting by relying on probabilistic techniques. However, these have two major deficiencies: (a) they assume that all the data and intermediary results can fit in main memory, and (b) they treat the entire tensor uniformly, ignoring potential non-uniformities in the noise distribution. In this paper, we propose a Noise-Profile Adaptive Tensor Decomposition (nTD) method, which aims to tackle both of these challenges. In particular, nTD leverages a grid-based two-phase decomposition strategy for two complementary purposes: firstly, the grid partitioning helps ensure that the memory footprint of the decomposition is kept low; secondly (and perhaps more importantly), any a priori knowledge about the noise profiles of the grid partitions enables us to develop a sample assignment strategy (or s-strategy) that best suits the noise distribution of the given tensor. Experiments show that nTD’s performance is significantly better than conventional CP decomposition techniques on noisy user data tensors.
AB - Tensor decomposition is used for many web and user data analysis operations, from clustering, trend detection, and anomaly detection to correlation analysis. However, many tensor decomposition schemes are sensitive to noisy data, an inevitable problem in the real world that can lead to false conclusions. The problem is compounded by overfitting when the user data is sparse. Recent research has shown that it is possible to avoid overfitting by relying on probabilistic techniques. However, these have two major deficiencies: (a) they assume that all the data and intermediary results can fit in main memory, and (b) they treat the entire tensor uniformly, ignoring potential non-uniformities in the noise distribution. In this paper, we propose a Noise-Profile Adaptive Tensor Decomposition (nTD) method, which aims to tackle both of these challenges. In particular, nTD leverages a grid-based two-phase decomposition strategy for two complementary purposes: firstly, the grid partitioning helps ensure that the memory footprint of the decomposition is kept low; secondly (and perhaps more importantly), any a priori knowledge about the noise profiles of the grid partitions enables us to develop a sample assignment strategy (or s-strategy) that best suits the noise distribution of the given tensor. Experiments show that nTD’s performance is significantly better than conventional CP decomposition techniques on noisy user data tensors.
UR - http://www.scopus.com/inward/record.url?scp=85051512640&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051512640&partnerID=8YFLogxK
U2 - 10.1145/3038912.3052641
DO - 10.1145/3038912.3052641
M3 - Conference contribution
AN - SCOPUS:85051512640
SN - 9781450349130
T3 - 26th International World Wide Web Conference, WWW 2017
SP - 243
EP - 252
BT - 26th International World Wide Web Conference, WWW 2017
PB - International World Wide Web Conferences Steering Committee
Y2 - 3 April 2017 through 7 April 2017
ER -