TY - GEN
T1 - It's who you know
T2 - 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2011
AU - Henderson, Keith
AU - Gallagher, Brian
AU - Li, Lei
AU - Akoglu, Leman
AU - Eliassi-Rad, Tina
AU - Tong, Hanghang
AU - Faloutsos, Christos
PY - 2011
Y1 - 2011
N2 - Given a graph, how can we extract good features for the nodes? For example, given two large graphs from the same domain, how can we use information in one to do classification in the other (i.e., perform across-network classification or transfer learning on graphs)? Also, if one of the graphs is anonymized, how can we use information in one to de-anonymize the other? The key step in all such graph mining tasks is to find effective node features. We propose ReFeX (Recursive Feature eXtraction), a novel algorithm, that recursively combines local (node-based) features with neighborhood (egonet-based) features; and outputs regional features - capturing "behavioral" information. We demonstrate how these powerful regional features can be used in within-network and across-network classification and de-anonymization tasks - without relying on homophily, or the availability of class labels. The contributions of our work are as follows: (a) ReFeX is scalable and (b) it is effective, capturing regional ("behavioral") information in large graphs. We report experiments on real graphs from various domains with over 1M edges, where ReFeX outperforms its competitors on typical graph mining tasks like network classification and de-anonymization.
AB - Given a graph, how can we extract good features for the nodes? For example, given two large graphs from the same domain, how can we use information in one to do classification in the other (i.e., perform across-network classification or transfer learning on graphs)? Also, if one of the graphs is anonymized, how can we use information in one to de-anonymize the other? The key step in all such graph mining tasks is to find effective node features. We propose ReFeX (Recursive Feature eXtraction), a novel algorithm, that recursively combines local (node-based) features with neighborhood (egonet-based) features; and outputs regional features - capturing "behavioral" information. We demonstrate how these powerful regional features can be used in within-network and across-network classification and de-anonymization tasks - without relying on homophily, or the availability of class labels. The contributions of our work are as follows: (a) ReFeX is scalable and (b) it is effective, capturing regional ("behavioral") information in large graphs. We report experiments on real graphs from various domains with over 1M edges, where ReFeX outperforms its competitors on typical graph mining tasks like network classification and de-anonymization.
KW - Feature extraction
KW - Graph mining
KW - Identity resolution
KW - Network classification
UR - http://www.scopus.com/inward/record.url?scp=80052674771&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052674771&partnerID=8YFLogxK
U2 - 10.1145/2020408.2020512
DO - 10.1145/2020408.2020512
M3 - Conference contribution
AN - SCOPUS:80052674771
SN - 9781450308137
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 663
EP - 671
BT - Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD'11
PB - Association for Computing Machinery
Y2 - 21 August 2011 through 24 August 2011
ER -