TY - GEN
T1 - M2TD
T2 - 34th IEEE International Conference on Data Engineering, ICDE 2018
AU - Li, Xinsheng
AU - Candan, Kasim
AU - Sapino, Maria Luisa
N1 - Funding Information:
Research is supported by NSF#1318788 “Data Management for Real-Time Data Driven Epidemic Spread Simulations”, NSF#1339835 “E-SDMS: Energy Simulation Data Management System Software”, NSF#1610282 “DataStorm: A Data Enabled System for End-to-End Disaster Planning and Response”, NSF#1633381 “BIGDATA: Discovering Context-Sensitive Impact in Complex Systems”, and “FourCmodeling”: EU-H2020 Marie Sklodowska-Curie grant agreement No 690817.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/24
Y1 - 2018/10/24
N2 - Data- and model-driven computer simulations are increasingly critical in many application domains. These simulations may track 10s or 100s of parameters, affected by complex inter-dependent dynamic processes. Moreover, decision makers usually need to run large simulation ensembles, containing 1000s of simulations. In this paper, we rely on a tensor-based framework to represent and analyze patterns in large simulation ensemble data sets to obtain a high-level understanding of the dynamic processes implied by a given ensemble of simulations. We further note that the inherent sparsity of the simulation ensembles (relative to the space of potential simulations one can run) constitutes a significant problem in discovering these underlying patterns. To address this challenge, we propose a partition-stitch sampling scheme, which divides the parameter space into subspaces to collect several lower-modal ensembles, and complement this with a novel Multi-Task Tensor Decomposition (M2TD) technique, which helps effectively and efficiently stitch these subensembles back together. Experiments showed that, for a given budget of simulations, the proposed structured sampling scheme leads to significantly better overall accuracy relative to traditional sampling approaches, even when the user does not have perfect information to help guide the structured partitioning process.
AB - Data- and model-driven computer simulations are increasingly critical in many application domains. These simulations may track 10s or 100s of parameters, affected by complex inter-dependent dynamic processes. Moreover, decision makers usually need to run large simulation ensembles, containing 1000s of simulations. In this paper, we rely on a tensor-based framework to represent and analyze patterns in large simulation ensemble data sets to obtain a high-level understanding of the dynamic processes implied by a given ensemble of simulations. We further note that the inherent sparsity of the simulation ensembles (relative to the space of potential simulations one can run) constitutes a significant problem in discovering these underlying patterns. To address this challenge, we propose a partition-stitch sampling scheme, which divides the parameter space into subspaces to collect several lower-modal ensembles, and complement this with a novel Multi-Task Tensor Decomposition (M2TD) technique, which helps effectively and efficiently stitch these subensembles back together. Experiments showed that, for a given budget of simulations, the proposed structured sampling scheme leads to significantly better overall accuracy relative to traditional sampling approaches, even when the user does not have perfect information to help guide the structured partitioning process.
KW - Simulation
KW - Tensor
KW - Tensor Decomposition
UR - http://www.scopus.com/inward/record.url?scp=85057089216&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057089216&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2018.00106
DO - 10.1109/ICDE.2018.00106
M3 - Conference contribution
AN - SCOPUS:85057089216
T3 - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
SP - 1156
EP - 1167
BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 April 2018 through 19 April 2018
ER -