GI-NMF: Group incremental non-negative matrix factorization on data streams

Xilun Chen, Kasim Candan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Scopus citations

Abstract

Non-negative matrix factorization (NMF) is a well-known method for obtaining low-rank approximations of data sets, which can then be used for efficient indexing, classification, and retrieval. The non-negativity constraints enable probabilistic interpretation of the results and discovery of generative models. One key disadvantage of NMF, however, is that it is costly to compute, which makes it difficult to apply in applications where data is dynamic. In this paper, we recognize that many applications involve redundancies and we argue that these redundancies can and should be leveraged for reducing the computational cost of the NMF process. First, online applications involving data streams often include temporal redundancies. Second, and perhaps less obviously, many applications integrate multiple data streams (with potential overlaps) and/or involve tracking multiple similar (but different) queries; this leads to significant data and query redundancies which, if leveraged properly, can help alleviate the computational cost of NMF. Based on these observations, we introduce Group Incremental Non-Negative Matrix Factorization (GI-NMF), which leverages redundancies across multiple NMF tasks over data streams. The proposed algorithm relies on a novel group multiplicative update rules (G-MUR) method to significantly reduce the cost of NMF. G-MUR is further complemented with support for incremental updates of the factors where data evolves continuously. Experiments show that GI-NMF significantly reduces the processing time, with minimal error overhead.
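
For readers unfamiliar with multiplicative update rules, the following is a minimal sketch of the standard single-task multiplicative updates for NMF under the Frobenius-norm objective (the classic Lee and Seung scheme). It is not the G-MUR or GI-NMF method of the paper, whose details are not given in this abstract. The function name `nmf_mur` and its parameters (`rank`, `n_iter`, `eps`, `seed`) are illustrative assumptions.

```python
import numpy as np

def nmf_mur(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (n x m) as W @ H.

    Hypothetical helper for illustration only: plain multiplicative
    updates for min ||X - W H||_F^2 with W >= 0 and H >= 0.
    This is NOT the paper's G-MUR.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random non-negative initialization of both factors.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor element-wise, so entries
        # stay non-negative; eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 3)) @ rng.random((3, 15))  # non-negative, rank <= 3
    W, H = nmf_mur(X, rank=3)
    print("relative error:", np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```

Each iteration costs several dense matrix products over the full data matrix, which illustrates why recomputing the factorization from scratch for every task and every stream update is expensive.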

Original language: English (US)
Title of host publication: CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 1119-1128
Number of pages: 10
ISBN (Electronic): 9781450325981
State: Published - Nov 3 2014
Event: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014 - Shanghai, China
Duration: Nov 3 2014 - Nov 7 2014

Publication series

Name: CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management

Other

Other: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
Country/Territory: China
City: Shanghai
Period: 11/3/14 - 11/7/14

ASJC Scopus subject areas

  • Information Systems and Management
  • Computer Science Applications
  • Information Systems
