Project Details

Description

Through this CC* Computing project, Arizona State University (ASU) proposes to acquire, deploy, operate, and maintain an advanced federated open research computing enclave (AFORCE) for use by all faculty, staff, and students within the Arizona tri-university system comprising ASU, The University of Arizona (UA), and Northern Arizona University (NAU), with additional capacity dedicated to the Open Science Grid (OSG) distributed computing effort.

Specifically, this CC* project will deploy a pool of 33 state-of-the-art NVIDIA A100 GPUs for open research. To maximize the impact of this NSF investment, the GPUs will be housed in 11 new Dell compute nodes operated within ASU's existing HPC infrastructure: they will be accessed from ASU's current HPC login nodes, scheduled by existing Slurm management nodes, and will leverage existing high-speed scratch storage and InfiniBand switches.
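As a sketch of how researchers would reach these GPUs once the nodes enter production, a Slurm batch script submitted from a login node might look like the following. The partition, gres, and module names here are illustrative assumptions, not ASU's actual configuration.

    #!/bin/bash
    #SBATCH --job-name=a100-test       # job name shown in the queue
    #SBATCH --partition=general        # hypothetical partition for the new nodes
    #SBATCH --gres=gpu:a100:1          # request one A100 GPU (gres name assumed)
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=64G
    #SBATCH --time=04:00:00

    module load cuda                   # module name assumed
    nvidia-smi                         # confirm the allocated GPU is visible
    srun python train.py               # placeholder user workload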

Drawing upon ASU's mission to enable access to discovery and scholarship in science, engineering, and health, this project envisions a research and education environment with advanced computing capacity accessible to all students, staff, and faculty, including early-career faculty and researchers experiencing gaps in funding. In service of this vision, the goal of this project is to accelerate the process of discovery by providing access to GPU-enabled computing at a scale not currently possible. To achieve this goal, the project comprises the following objectives:

  1. Increase ASU's general-use GPU capacity.
  2. Develop a federated access mechanism for resource sharing across the Arizona tri-university system.
  3. Facilitate the use of throughput computing via OSG, both locally at ASU and globally.
  4. Ensure friction-free access to cloud computing resources from on-premises ASU computing systems.

Access for extramural researchers within the Arizona tri-university system will be provisioned via InCommon. All three universities are InCommon members, and ASU HPC login nodes will be configured to authenticate via each university's respective identity provider.

ASU will facilitate local ASU use of OSG by incorporating awareness of OSG capabilities into regular Research Computing training sessions and faculty engagement events; in 2020, Research Computing hosted over 25 such events with over 400 attendees. ASU has operated a dedicated HTCondor pool of approximately 300 CPU cores since 2018, and through this project ASU will configure the proposed GPU nodes with the HTCondor glidein mechanism, joining them to the global OSG computing pool and allowing them to process OSG compute jobs from around the world. The AFORCE system will dedicate a minimum of 27% of its capacity to extramural research, including the Arizona tri-university system and OSG.
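For OSG users, access to the glidein-backed pool would follow the standard HTCondor submission model. A minimal submit description requesting a GPU slot might look like this sketch, in which the script name and resource values are placeholders rather than project-specified settings.

    # gpu-job.sub -- illustrative OSG submit description (names are placeholders)
    universe       = vanilla
    executable     = run_inference.sh    # hypothetical user script
    request_gpus   = 1                   # request one GPU slot from the pool
    request_cpus   = 1
    request_memory = 8GB
    output         = job.out
    error          = job.err
    log            = job.log
    queue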

Finally, through a collaboration with Google, ASU researchers will be able to leverage Google Cloud Platform (GCP) when extant ASU resources are insufficient for, or incapable of, fulfilling a researcher's workload. Researchers will be able to submit such workloads directly to GCP from existing campus HPC login nodes. Additionally, the system will connect to GCP via the Internet2 CloudConnect service through the Sun Corridor Network, Arizona's state research and education network provider.
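As one illustration of such a cloud burst, a workload could be directed to an A100-equipped GCP instance from a login node using the gcloud CLI, roughly as below. The project ID, zone, and machine shape are assumptions, and the production submission path may be more automated than a hand-run command.

    # Provision a single-A100 VM in GCP (all names and values are illustrative).
    # a2-highgpu-1g is an A2 machine type with one A100 attached; GPU instances
    # must be created with --maintenance-policy=TERMINATE.
    gcloud compute instances create aforce-burst-01 \
        --project=example-asu-project \
        --zone=us-central1-a \
        --machine-type=a2-highgpu-1g \
        --maintenance-policy=TERMINATE \
        --image-family=debian-11 \
        --image-project=debian-cloud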
Status: Active
Effective start/end date: 10/1/21 – 9/30/23

Funding

  • National Science Foundation (NSF): $399,997.00
