Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs

Jiaming Xu; Rui Wu; Kai Zhu; Bruce Hajek; R. Srikant; Lei Ying

doi:10.1145/2591971.2592005

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Scopus citations

Abstract

In standard clustering problems, data points are represented by vectors, and stacking them together forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we study three algorithms with different running times and compare the number of observations each needs for successful cluster recovery. Our analytical results show smooth time-data trade-offs: one can gradually reduce the computational complexity as increasingly more observations become available.
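The observation setting described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name, the uniform cluster assignment, the independent per-entry sampling, and the parameters observe_prob and flip_prob are all assumptions made for illustration. Each (row cluster, column cluster) block shares one hidden binary value; each entry is revealed with probability observe_prob and, if revealed, flipped with probability flip_prob.

```python
import random


def sample_observations(n_rows, n_cols, row_k, col_k,
                        observe_prob, flip_prob, seed=0):
    """Sample noisy, partial observations of a block-constant binary matrix.

    Hypothetical illustration only; the paper's exact model may differ.
    """
    rng = random.Random(seed)
    # Hidden cluster labels (assumed here to be uniform at random).
    row_labels = [rng.randrange(row_k) for _ in range(n_rows)]
    col_labels = [rng.randrange(col_k) for _ in range(n_cols)]
    # One hidden binary value per (row cluster, column cluster) block.
    block = [[rng.randint(0, 1) for _ in range(col_k)] for _ in range(row_k)]
    observed = {}
    for i in range(n_rows):
        for j in range(n_cols):
            # Reveal each entry independently with probability observe_prob.
            if rng.random() < observe_prob:
                value = block[row_labels[i]][col_labels[j]]
                # Flip the revealed entry with probability flip_prob (noise).
                if rng.random() < flip_prob:
                    value = 1 - value
                observed[(i, j)] = value
    return row_labels, col_labels, block, observed
```

A recovery algorithm would receive only the observed dictionary and try to reconstruct row_labels and col_labels up to relabeling; the trade-off in the abstract concerns how large observe_prob must be for algorithms of different running times to succeed.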

Original language	English (US)
Title of host publication	SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
Publisher	Association for Computing Machinery
Pages	29-41
Number of pages	13
ISBN (Print)	9781450327893
DOIs	https://doi.org/10.1145/2591971.2592005
State	Published - 2014
Event	2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2014 - Austin, TX, United States
Duration	Jun 16 2014 → Jun 20 2014

Publication series

Name	SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems

Conference

Conference	2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2014
Country/Territory	United States
City	Austin, TX
Period	6/16/14 → 6/20/14

Keywords

Clustering
Low-Rank Matrix Recovery
Spectral Method

ASJC Scopus subject areas

Computer Graphics and Computer-Aided Design
Modeling and Simulation

Access to Document

10.1145/2591971.2592005

Cite this

Xu, J., Wu, R., Zhu, K., Hajek, B., Srikant, R., & Ying, L. (2014). Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs. In SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (pp. 29-41). (SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems). Association for Computing Machinery. https://doi.org/10.1145/2591971.2592005

Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs. / Xu, Jiaming; Wu, Rui; Zhu, Kai et al.
SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems. Association for Computing Machinery, 2014. p. 29-41 (SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems).

Xu, J, Wu, R, Zhu, K, Hajek, B, Srikant, R & Ying, L 2014, Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs. in SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems. SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, Association for Computing Machinery, pp. 29-41, 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2014, Austin, TX, United States, 6/16/14. https://doi.org/10.1145/2591971.2592005

Xu J, Wu R, Zhu K, Hajek B, Srikant R, Ying L. Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs. In SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems. Association for Computing Machinery. 2014. p. 29-41. (SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems). doi: 10.1145/2591971.2592005

Xu, Jiaming ; Wu, Rui ; Zhu, Kai et al. / Jointly clustering rows and columns of binary matrices : Algorithms and trade-offs. SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems. Association for Computing Machinery, 2014. pp. 29-41 (SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems).

@inproceedings{051039b4576246ef9dffe10883711930,
title = "Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs",
abstract = "In standard clustering problems, data points are represented by vectors, and stacking them together forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we study three algorithms with different running times and compare the number of observations each needs for successful cluster recovery. Our analytical results show smooth time-data trade-offs: one can gradually reduce the computational complexity as increasingly more observations become available.",
keywords = "Clustering, Low-Rank Matrix Recovery, Spectral Method",
author = "Jiaming Xu and Rui Wu and Kai Zhu and Bruce Hajek and R. Srikant and Lei Ying",
year = "2014",
doi = "10.1145/2591971.2592005",
language = "English (US)",
isbn = "9781450327893",
series = "SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems",
publisher = "Association for Computing Machinery",
pages = "29--41",
booktitle = "SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems",
note = "2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2014 ; Conference date: 16-06-2014 Through 20-06-2014",
}

TY - GEN
T1 - Jointly clustering rows and columns of binary matrices: Algorithms and trade-offs
T2 - 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS 2014
AU - Xu, Jiaming
AU - Wu, Rui
AU - Zhu, Kai
AU - Hajek, Bruce
AU - Srikant, R.
AU - Ying, Lei
PY - 2014
Y1 - 2014
N2 - In standard clustering problems, data points are represented by vectors, and stacking them together forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we study three algorithms with different running times and compare the number of observations each needs for successful cluster recovery. Our analytical results show smooth time-data trade-offs: one can gradually reduce the computational complexity as increasingly more observations become available.
AB - In standard clustering problems, data points are represented by vectors, and stacking them together forms a data matrix with row or column cluster structure. In this paper, we consider a class of binary matrices, arising in many applications, which exhibit both row and column cluster structure, and our goal is to exactly recover the underlying row and column clusters by observing only a small fraction of noisy entries. We first derive a lower bound on the minimum number of observations needed for exact cluster recovery. Then, we study three algorithms with different running times and compare the number of observations each needs for successful cluster recovery. Our analytical results show smooth time-data trade-offs: one can gradually reduce the computational complexity as increasingly more observations become available.
KW - Clustering
KW - Low-Rank Matrix Recovery
KW - Spectral Method
UR - http://www.scopus.com/inward/record.url?scp=84904346897&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84904346897&partnerID=8YFLogxK
U2 - 10.1145/2591971.2592005
DO - 10.1145/2591971.2592005
M3 - Conference contribution
AN - SCOPUS:84955607655
SN - 9781450327893
T3 - SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
SP - 29
EP - 41
BT - SIGMETRICS 2014 - Proceedings of the 2014 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
PB - Association for Computing Machinery
Y2 - 16 June 2014 through 20 June 2014
ER -
