Collaborative scientific workflow composition as a service- An infrastructure supporting collaborative data analytics workflow design and management

Jia Zhang, Qihao Bao, Xiaoyi Duan, Shiyong Lu, Lijun Xue, Runyu Shi, Pingbo Tang

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

1 Citation (Scopus)

Abstract

The need for collaborative data analytics grows significantly in the face of big data challenges. Although workflow tools offer a formal way to define, automate, and repeat multi-step computational procedures, designing a complex data processing workflow requires collaboration among multiple people with complementary expertise. Existing tools are poorly suited to the collaborative design of comprehensive workflows. To address this challenge, this paper reports the design and development of a software infrastructure that supports collaborative data-oriented workflow composition and management, adding a key component to existing cyberinfrastructure that will support big data collaboration through the Internet. A collaborative provenance query model (CPM) is presented together with graph-based patterns and algebra. A hypergraph theory-based provenance mining technique is reported. The research extends an existing open-source workflow tool by adding system-level facilities to support the human interaction and cooperation that are essential for effective and efficient scientific collaboration.

Original language: English (US)
Title of host publication: Proceedings - 2016 IEEE 2nd International Conference on Collaboration and Internet Computing, IEEE CIC 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 219-228
Number of pages: 10
ISBN (Electronic): 9781509046072
DOIs: https://doi.org/10.1109/CIC.2016.37
State: Published - Jan 6 2017
Event: 2nd IEEE International Conference on Collaboration and Internet Computing, IEEE CIC 2016 - Pittsburgh, United States
Duration: Nov 1 2016 to Nov 3 2016

Fingerprint

workflow
infrastructure
Chemical analysis
management
Algebra
Internet
expertise
interaction
Big data

Keywords

  • Big data analytics
  • Collaborative provenance
  • Collaborative workflow design
  • Scientific workflow

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Safety, Risk, Reliability and Quality
  • Sociology and Political Science

Cite this

Zhang, J., Bao, Q., Duan, X., Lu, S., Xue, L., Shi, R., & Tang, P. (2017). Collaborative scientific workflow composition as a service- An infrastructure supporting collaborative data analytics workflow design and management. In Proceedings - 2016 IEEE 2nd International Conference on Collaboration and Internet Computing, IEEE CIC 2016 (pp. 219-228). [7809710] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CIC.2016.37

@inproceedings{5faccca908d2417f98941f6d9a8dd223,
title = "Collaborative scientific workflow composition as a service- An infrastructure supporting collaborative data analytics workflow design and management",
abstract = "The need for collaborative data analytics increases significantly when confronted with the challenges of big data. Although workflow tools offer a formal way to define, automate, and repeat multi-step computational procedures, designing complex data processing workflow requires collaboration from multiple people with complementary expertise. Existing tools are not suitable to support collaborative design of comprehensive workflows. To address such a challenge, this paper reports the design and development of a software infrastructure with the capability of supporting collaborative data-oriented workflow composition and management, adding a key component to existing cyberinfrastructure that will support big data collaboration through the Internet. A collaborative provenance query model (CPM) is presented together with graph-based patterns and algebra. A hypergraph theory-based provenance mining technique is reported. The research extends an existing opensource workflow tool, by adding system-level facilities to support human interaction and cooperation that are essential for an effective and efficient scientific collaboration.",
keywords = "Big data analytics, Collaborative provenance, Collaborative workflow design, Scientific workflow",
author = "Jia Zhang and Qihao Bao and Xiaoyi Duan and Shiyong Lu and Lijun Xue and Runyu Shi and Pingbo Tang",
year = "2017",
month = "1",
day = "6",
doi = "10.1109/CIC.2016.37",
language = "English (US)",
pages = "219--228",
booktitle = "Proceedings - 2016 IEEE 2nd International Conference on Collaboration and Internet Computing, IEEE CIC 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY  - GEN
T1  - Collaborative scientific workflow composition as a service- An infrastructure supporting collaborative data analytics workflow design and management
AU  - Zhang, Jia
AU  - Bao, Qihao
AU  - Duan, Xiaoyi
AU  - Lu, Shiyong
AU  - Xue, Lijun
AU  - Shi, Runyu
AU  - Tang, Pingbo
PY  - 2017/1/6
Y1  - 2017/1/6
N2  - The need for collaborative data analytics increases significantly when confronted with the challenges of big data. Although workflow tools offer a formal way to define, automate, and repeat multi-step computational procedures, designing complex data processing workflow requires collaboration from multiple people with complementary expertise. Existing tools are not suitable to support collaborative design of comprehensive workflows. To address such a challenge, this paper reports the design and development of a software infrastructure with the capability of supporting collaborative data-oriented workflow composition and management, adding a key component to existing cyberinfrastructure that will support big data collaboration through the Internet. A collaborative provenance query model (CPM) is presented together with graph-based patterns and algebra. A hypergraph theory-based provenance mining technique is reported. The research extends an existing opensource workflow tool, by adding system-level facilities to support human interaction and cooperation that are essential for an effective and efficient scientific collaboration.
KW  - Big data analytics
KW  - Collaborative provenance
KW  - Collaborative workflow design
KW  - Scientific workflow
UR  - http://www.scopus.com/inward/record.url?scp=85013151794&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85013151794&partnerID=8YFLogxK
U2  - 10.1109/CIC.2016.37
DO  - 10.1109/CIC.2016.37
M3  - Conference contribution
AN  - SCOPUS:85013151794
SP  - 219
EP  - 228
BT  - Proceedings - 2016 IEEE 2nd International Conference on Collaboration and Internet Computing, IEEE CIC 2016
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -