Abstract

Non-negative matrix factorization (NMF) is a well-known method for obtaining low-rank approximations of data sets, which can then be used for efficient indexing, classification, and retrieval. The non-negativity constraints enable probabilistic interpretation of the results and discovery of generative models. One key disadvantage of NMF, however, is that it is costly to obtain, which makes it difficult to apply NMF in applications where data is dynamic. In this paper, we recognize that many applications involve redundancies and we argue that these redundancies can and should be leveraged for reducing the computational cost of the NMF process: Firstly, online applications involving data streams often include temporal redundancies. Secondly, and perhaps less obviously, many applications include integration of multiple data streams (with potential overlaps) and/or involve tracking of multiple similar (but different) queries; this leads to significant data and query redundancies, which, if leveraged properly, can help alleviate the computational cost of NMF. Based on these observations, we introduce Group Incremental Non-Negative Matrix Factorization (GI-NMF), which leverages redundancies across multiple NMF tasks over data streams. The proposed algorithm relies on a novel group multiplicative update rules (G-MUR) method to significantly reduce the cost of NMF. G-MUR is further complemented to support incremental update of the factors where data evolves continuously. Experiments show that GI-NMF significantly reduces the processing time, with minimal error overhead.
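For readers unfamiliar with multiplicative update rules, the following is a minimal sketch of the classic single-matrix updates (in the style of Lee and Seung) that approximate a non-negative matrix V by the product W H. It is not the G-MUR or GI-NMF method from the paper, whose details are not given in this record; the function names, the fixed iteration count, and the small constant `eps` are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - W H."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf_mur(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x rank) and H (rank x m).

    Uses the standard multiplicative updates; `eps` (an assumption of this
    sketch) only guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Random strictly positive initial factors.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h
```

Because every update multiplies a non-negative entry by a non-negative ratio, the factors stay non-negative without an explicit projection step. Each iteration recomputes several matrix products over the full data, which is the kind of cost the abstract describes as hard to afford when data is dynamic.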

Original language: English (US)
Title of host publication: CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery, Inc
Pages: 1119-1128
Number of pages: 10
ISBN (Print): 9781450325981
DOI: https://doi.org/10.1145/2661829.2662008
State: Published - Nov 3 2014
Event: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014 - Shanghai, China
Duration: Nov 3 2014 to Nov 7 2014

Other

Other: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
Country: China
City: Shanghai
Period: 11/3/14 to 11/7/14

Fingerprint

Factorization
Redundancy
Matrix factorization
Incremental
Data streams
Costs
Processing

ASJC Scopus subject areas

  • Information Systems and Management
  • Computer Science Applications
  • Information Systems

Cite this

Chen, X., & Candan, K. (2014). GI-NMF: Group incremental non-negative matrix factorization on data streams. In CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management (pp. 1119-1128). Association for Computing Machinery, Inc. https://doi.org/10.1145/2661829.2662008

GI-NMF: Group incremental non-negative matrix factorization on data streams. / Chen, Xilun; Candan, Kasim.

CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Inc, 2014. p. 1119-1128.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Chen, X & Candan, K 2014, GI-NMF: Group incremental non-negative matrix factorization on data streams. in CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Inc, pp. 1119-1128, 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014, Shanghai, China, 11/3/14. https://doi.org/10.1145/2661829.2662008
Chen X, Candan K. GI-NMF: Group incremental non-negative matrix factorization on data streams. In CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Inc. 2014. p. 1119-1128 https://doi.org/10.1145/2661829.2662008
Chen, Xilun; Candan, Kasim. / GI-NMF: Group incremental non-negative matrix factorization on data streams. CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, Inc, 2014. pp. 1119-1128
@inproceedings{cc55414d3cd9495c957e1016ef47aa47,
title = "GI-NMF: Group incremental non-negative matrix factorization on data streams",
abstract = "Non-negative matrix factorization (NMF) is a well-known method for obtaining low-rank approximations of data sets, which can then be used for efficient indexing, classification, and retrieval. The non-negativity constraints enable probabilistic interpretation of the results and discovery of generative models. One key disadvantage of NMF, however, is that it is costly to obtain, which makes it difficult to apply NMF in applications where data is dynamic. In this paper, we recognize that many applications involve redundancies and we argue that these redundancies can and should be leveraged for reducing the computational cost of the NMF process: Firstly, online applications involving data streams often include temporal redundancies. Secondly, and perhaps less obviously, many applications include integration of multiple data streams (with potential overlaps) and/or involve tracking of multiple similar (but different) queries; this leads to significant data and query redundancies, which, if leveraged properly, can help alleviate the computational cost of NMF. Based on these observations, we introduce Group Incremental Non-Negative Matrix Factorization (GI-NMF), which leverages redundancies across multiple NMF tasks over data streams. The proposed algorithm relies on a novel group multiplicative update rules (G-MUR) method to significantly reduce the cost of NMF. G-MUR is further complemented to support incremental update of the factors where data evolves continuously. Experiments show that GI-NMF significantly reduces the processing time, with minimal error overhead.",
author = "Xilun Chen and Kasim Candan",
year = "2014",
month = "11",
day = "3",
doi = "10.1145/2661829.2662008",
language = "English (US)",
isbn = "9781450325981",
pages = "1119--1128",
booktitle = "CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management",
publisher = "Association for Computing Machinery, Inc",

}

TY - GEN

T1 - GI-NMF

T2 - Group incremental non-negative matrix factorization on data streams

AU - Chen, Xilun

AU - Candan, Kasim

PY - 2014/11/3

Y1 - 2014/11/3

AB - Non-negative matrix factorization (NMF) is a well-known method for obtaining low-rank approximations of data sets, which can then be used for efficient indexing, classification, and retrieval. The non-negativity constraints enable probabilistic interpretation of the results and discovery of generative models. One key disadvantage of NMF, however, is that it is costly to obtain, which makes it difficult to apply NMF in applications where data is dynamic. In this paper, we recognize that many applications involve redundancies and we argue that these redundancies can and should be leveraged for reducing the computational cost of the NMF process: Firstly, online applications involving data streams often include temporal redundancies. Secondly, and perhaps less obviously, many applications include integration of multiple data streams (with potential overlaps) and/or involve tracking of multiple similar (but different) queries; this leads to significant data and query redundancies, which, if leveraged properly, can help alleviate the computational cost of NMF. Based on these observations, we introduce Group Incremental Non-Negative Matrix Factorization (GI-NMF), which leverages redundancies across multiple NMF tasks over data streams. The proposed algorithm relies on a novel group multiplicative update rules (G-MUR) method to significantly reduce the cost of NMF. G-MUR is further complemented to support incremental update of the factors where data evolves continuously. Experiments show that GI-NMF significantly reduces the processing time, with minimal error overhead.

UR - http://www.scopus.com/inward/record.url?scp=84937566005&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84937566005&partnerID=8YFLogxK

U2 - 10.1145/2661829.2662008

DO - 10.1145/2661829.2662008

M3 - Conference contribution

SN - 9781450325981

SP - 1119

EP - 1128

BT - CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management

PB - Association for Computing Machinery, Inc

ER -