Description

The success of science as an institution is largely a result of intellectual scaffolding and cumulative knowledge that leads to rapid innovation and significant societal returns. This drives and sustains a set of norms in which the claim to intellectual ownership of an idea or discovery is made by being the first to disseminate it publicly. A key benefit of this transparency is the reproducibility of scientific research, a critical element of peer-based quality control that maintains the reliability and reputation of science as a knowledge system. In recent decades, computation has evolved from a tool for assisting scientific research into a digital laboratory where fundamental scientific discoveries take place. This is increasingly the case in the social and ecological sciences, which use computational models to better understand social and earth systems that are characterized by complex interactions and whose dynamics underlie many of the grand challenges faced by humanity today. The growing importance of computing in science makes it imperative that the same norms that support other scientific knowledge claims be extended to scientific software.

Intellectual Merit: The Big Data Spoke proposed here is designed to promote transparency and reproducibility in scientific computation and accelerate progress towards this goal through three interrelated components:
1. cyberinfrastructure to semi-automate and professionally reward the sometimes tedious efforts needed for archiving and disseminating the code, associated datasets, metadata, and analysis workflows of scientific computation;
2. a linked bibliometrics database (continuously updated by integrating automated citation searching with crowd-sourced review of sources) that will allow monitoring of the ongoing state of code archival and provide the basis for the BDSpoke to apply social incentives that can nudge scientists to publish code and associated documents;
3. a regional Working Group encompassing stakeholders from the developer, content provider, and user communities to provide the BDSpoke with advice and expertise for developing cyberinfrastructure, to develop community-wide strategies and standards for promoting reproducible science, and to provide exemplar use-cases of models and data synthesis from the social and ecological sciences for testing and improving the BDSpoke cyberinfrastructure.
This work leverages the intellectual and technical resources of the Western Big Data Innovation Hub to help us build on a decade of experience by the Network for Computational Modeling in the Social and Ecological Sciences (CoMSES Net) in developing and managing a large-scale research network for the computational modeling community and our expertise in the science of collective action.

Broader Impacts: The proposed BDSpoke will create an online environment that will serve as a comprehensive exemplar solution to the challenges of disseminating not just the results but also the important processes of scientific computation that are needed for reproducibility. This environment will enable broad access by the scientific community and general public to important data resources that are being used in the social and ecological sciences to assess the consequences of alternative scenarios, policies, and assumptions. Increased transparency in scientific computing can accelerate cost-effective development of high-quality models of complexly coupled human and natural systems, and improve knowledge scaffolding for modeling across multiple application domains. The proposed bibliometrics database of model publications, linked with archived model code and associated documents, will provide a unique search tool to enable scientists to find relevant publications and models (including accessible code) and an ongoing monitor of the ways in which computational modeling is applied to diverse topics within diverse disciplines.
Our Working Group and planned educational activities aim to train the next generation of scholars and develop a workforce that follows best practices on model reproducibility in scientific computation and data synthesis.
Status: Finished
Effective start/end date: 9/1/16 to 8/31/19

Funding

  • National Science Foundation (NSF): $1,014,593.00

Fingerprint

transparency
innovation
modeling
science
collective action
metadata
resource
quality control
ownership
train
incentive
stakeholder
software
code
monitoring
cost
document
norm
public
scientific research