TY - JOUR
T1 - Graph based anomaly detection and description
T2 - A survey
AU - Akoglu, Leman
AU - Tong, Hanghang
AU - Koutra, Danai
N1 - Funding Information:
This material is based upon work supported by the Army Research Office (ARO) under Cooperative Agreement Numbers W911NF-14-1-0029 and W911NF-09-2-0053, the Defense Advanced Research Projects Agency (DARPA) under Contract Numbers W911NF-11-C-0088, W911NF-11-C-0200 and W911NF-12-C-0028, the National Science Foundation (NSF) under Grant Nos. IIS-1217559 and IIS-1017415, by Region II University Transportation Center under the Project number 49997-33-25, and the Stony Brook University Office of Vice President for Research. Any findings and conclusions expressed in this material are those of the author(s) and do not necessarily reflect the position or the policy of the U.S. Government and the other funding parties, and no official endorsement should be inferred. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.
Publisher Copyright:
© 2014, The Author(s).
PY - 2015/4/10
Y1 - 2015/4/10
AB - Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised versus (semi-)supervised approaches, for static versus dynamic graphs, for attributed versus plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.
KW - Anomaly description
KW - Anomaly detection
KW - Change point detection
KW - Event detection
KW - Fraud detection
KW - Graph mining
KW - Network anomaly detection
KW - Visual analytics
UR - http://www.scopus.com/inward/record.url?scp=84940282157&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84940282157&partnerID=8YFLogxK
U2 - 10.1007/s10618-014-0365-y
DO - 10.1007/s10618-014-0365-y
M3 - Article
AN - SCOPUS:84940282157
SN - 1384-5810
VL - 29
SP - 626
EP - 688
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 3
ER -