Collaborative filtering with information-rich and information-sparse entities

Kai Zhu; Rui Wu; Lei Ying; R. Srikant

doi:10.1007/s10994-014-5454-z

Collaborative filtering with information-rich and information-sparse entities

Kai Zhu, Rui Wu, Lei Ying, R. Srikant

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω (M K log M) noisy entries while M K entries are necessary, where K is the number of clusters and M is the number of items. In the case of co-clustering, we prove that K² entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithm. The performance difference increases even more when noise is added to the datasets.

Original language	English (US)
Pages (from-to)	177-203
Number of pages	27
Journal	Machine Learning
Volume	97
Issue number	1-2
DOIs	https://doi.org/10.1007/s10994-014-5454-z
State	Published - Oct 2014

Keywords

Clustering model
Collaborative filtering
Matrix completion
Recommender system

ASJC Scopus subject areas

Software
Artificial Intelligence

Access to Document

10.1007/s10994-014-5454-z

Cite this

@article{a3134ff54ee042d49fcf87d6c4359556,

title = "Collaborative filtering with information-rich and information-sparse entities",

abstract = "In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω (M K log M) noisy entries while M K entries are necessary, where K is the number of clusters and M is the number of items. In the case of co-clustering, we prove that K2 entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithm. The performance difference increases even more when noise is added to the datasets.",

keywords = "Clustering model, Collaborative filtering, Matrix completion, Recommender system",

author = "Kai Zhu and Rui Wu and Lei Ying and R. Srikant",

note = "Funding Information: Acknowledgments Research supported in part by AFOSR grant FA 9550-10-1-0573 and NSF grants ECCS-1202065 and ECCS-1255425. We thank Prof. Olgica Milenkovic for comments on an earlier version of the paper.",

year = "2014",

month = oct,

doi = "10.1007/s10994-014-5454-z",

language = "English (US)",

volume = "97",

pages = "177--203",

journal = "Machine Learning",

issn = "0885-6125",

publisher = "Springer Netherlands",

number = "1-2",

}

TY - JOUR

T1 - Collaborative filtering with information-rich and information-sparse entities

AU - Zhu, Kai

AU - Wu, Rui

AU - Ying, Lei

AU - Srikant, R.

N1 - Funding Information: Acknowledgments Research supported in part by AFOSR grant FA 9550-10-1-0573 and NSF grants ECCS-1202065 and ECCS-1255425. We thank Prof. Olgica Milenkovic for comments on an earlier version of the paper.

PY - 2014/10

Y1 - 2014/10

N2 - In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω (M K log M) noisy entries while M K entries are necessary, where K is the number of clusters and M is the number of items. In the case of co-clustering, we prove that K2 entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithm. The performance difference increases even more when noise is added to the datasets.

AB - In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω (M K log M) noisy entries while M K entries are necessary, where K is the number of clusters and M is the number of items. In the case of co-clustering, we prove that K2 entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithm. The performance difference increases even more when noise is added to the datasets.

KW - Clustering model

KW - Collaborative filtering

KW - Matrix completion

KW - Recommender system

UR - http://www.scopus.com/inward/record.url?scp=84906946511&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906946511&partnerID=8YFLogxK

U2 - 10.1007/s10994-014-5454-z

DO - 10.1007/s10994-014-5454-z

M3 - Article

AN - SCOPUS:84906946511

SN - 0885-6125

VL - 97

SP - 177

EP - 203

JO - Machine Learning

JF - Machine Learning

IS - 1-2

ER -

Collaborative filtering with information-rich and information-sparse entities

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this