TY - GEN
T1 - Frequent instruction sequential pattern mining in hardware sample data
AU - Zou, Jia
AU - Xiao, Jing
AU - Hou, Rui
AU - Wang, Yanqi
PY - 2010/12/1
Y1 - 2010/12/1
N2 - As parallelism and heterogeneity have become the trend in computer system design, both the size and the complexity of the hardware sample data generated by the Performance Monitoring Unit (PMU) are growing rapidly; automatic analysis methods, i.e., data mining methods, are therefore urgently needed to accelerate hardware sample data analysis. We are the first to study instruction sequential pattern mining for hardware sample data. It is a challenging task because of the implicit sequential relationship contained in the data and the importance of low-frequency patterns. As a solution, we i) provide a novel algorithm, Prof Span; and ii) adapt two existing algorithms, which are based on candidate generation and projected database generation, to hardware sample data. Our evaluation results show that Prof Span can reduce execution time by up to 75% and 80% compared with the other two algorithms. In particular, up to 50% of the frequent patterns mined by Prof Span in hardware sample data cross basic block boundaries and cannot be found by existing methods for source code or disassembly code. We also analyze three example patterns identified by Prof Span (consecutive loads, JIT entry sequence, and conditional code dependency sequence) to illustrate how Prof Span can benefit performance analysis. Finally, we apply the patterns to module classification and obtain promising results.
AB - As parallelism and heterogeneity have become the trend in computer system design, both the size and the complexity of the hardware sample data generated by the Performance Monitoring Unit (PMU) are growing rapidly; automatic analysis methods, i.e., data mining methods, are therefore urgently needed to accelerate hardware sample data analysis. We are the first to study instruction sequential pattern mining for hardware sample data. It is a challenging task because of the implicit sequential relationship contained in the data and the importance of low-frequency patterns. As a solution, we i) provide a novel algorithm, Prof Span; and ii) adapt two existing algorithms, which are based on candidate generation and projected database generation, to hardware sample data. Our evaluation results show that Prof Span can reduce execution time by up to 75% and 80% compared with the other two algorithms. In particular, up to 50% of the frequent patterns mined by Prof Span in hardware sample data cross basic block boundaries and cannot be found by existing methods for source code or disassembly code. We also analyze three example patterns identified by Prof Span (consecutive loads, JIT entry sequence, and conditional code dependency sequence) to illustrate how Prof Span can benefit performance analysis. Finally, we apply the patterns to module classification and obtain promising results.
KW - Hardware sample data
KW - Performance analysis
KW - Sequential pattern mining
UR - http://www.scopus.com/inward/record.url?scp=79951738836&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79951738836&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2010.123
DO - 10.1109/ICDM.2010.123
M3 - Conference contribution
AN - SCOPUS:79951738836
SN - 9780769542560
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1205
EP - 1210
BT - Proceedings - 10th IEEE International Conference on Data Mining, ICDM 2010
T2 - 10th IEEE International Conference on Data Mining, ICDM 2010
Y2 - 14 December 2010 through 17 December 2010
ER -