TY - GEN
T1 - Collaborative scientific workflow composition as a service: An infrastructure supporting collaborative data analytics workflow design and management
AU - Zhang, Jia
AU - Bao, Qihao
AU - Duan, Xiaoyi
AU - Lu, Shiyong
AU - Xue, Lijun
AU - Shi, Runyu
AU - Tang, Pingbo
N1 - Funding Information:
This work is partially supported by the National Science Foundation under grant NSF ACI-1443069. We thank Alexandros Mavrogiannis and Grivan Thapar for their software development efforts.
Publisher Copyright:
© 2016 IEEE.
PY - 2017/1/6
Y1 - 2017/1/6
N2 - The need for collaborative data analytics increases significantly when confronted with the challenges of big data. Although workflow tools offer a formal way to define, automate, and repeat multi-step computational procedures, designing complex data processing workflows requires collaboration from multiple people with complementary expertise. Existing tools are not suitable to support collaborative design of comprehensive workflows. To address such a challenge, this paper reports the design and development of a software infrastructure with the capability of supporting collaborative data-oriented workflow composition and management, adding a key component to existing cyberinfrastructure that will support big data collaboration through the Internet. A collaborative provenance query model (CPM) is presented together with graph-based patterns and algebra. A hypergraph theory-based provenance mining technique is reported. The research extends an existing open-source workflow tool by adding system-level facilities to support human interaction and cooperation that are essential for effective and efficient scientific collaboration.
AB - The need for collaborative data analytics increases significantly when confronted with the challenges of big data. Although workflow tools offer a formal way to define, automate, and repeat multi-step computational procedures, designing complex data processing workflows requires collaboration from multiple people with complementary expertise. Existing tools are not suitable to support collaborative design of comprehensive workflows. To address such a challenge, this paper reports the design and development of a software infrastructure with the capability of supporting collaborative data-oriented workflow composition and management, adding a key component to existing cyberinfrastructure that will support big data collaboration through the Internet. A collaborative provenance query model (CPM) is presented together with graph-based patterns and algebra. A hypergraph theory-based provenance mining technique is reported. The research extends an existing open-source workflow tool by adding system-level facilities to support human interaction and cooperation that are essential for effective and efficient scientific collaboration.
KW - Big data analytics
KW - Collaborative provenance
KW - Collaborative workflow design
KW - Scientific workflow
UR - http://www.scopus.com/inward/record.url?scp=85013151794&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85013151794&partnerID=8YFLogxK
U2 - 10.1109/CIC.2016.37
DO - 10.1109/CIC.2016.37
M3 - Conference contribution
AN - SCOPUS:85013151794
T3 - Proceedings - 2016 IEEE 2nd International Conference on Collaboration and Internet Computing, IEEE CIC 2016
SP - 219
EP - 228
BT - Proceedings - 2016 IEEE 2nd International Conference on Collaboration and Internet Computing, IEEE CIC 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd IEEE International Conference on Collaboration and Internet Computing, IEEE CIC 2016
Y2 - 1 November 2016 through 3 November 2016
ER -