Mechanical and Biomedical Engineering Faculty Publications and Presentations

An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters

Dana A. Jacobsen, Boise State University
Julien C. Thibault, Boise State University
Inanc Senocak, Boise State University

Document Type

Conference Proceeding

Publication Date

1-2010

Abstract

Modern graphics processing units (GPUs) with many-core architectures have emerged as general-purpose parallel computing platforms that can accelerate simulation science applications tremendously. While multi-GPU workstations with several TeraFLOPS of peak computing power are available to accelerate computational problems, larger problems require even more resources. Conventional clusters of central processing units (CPUs) are now being augmented with multiple GPUs in each compute node to tackle large problems. The heterogeneous architecture of a multi-GPU cluster with a deep memory hierarchy creates unique challenges in developing scalable and efficient simulation codes. In this study, we pursue mixed MPI-CUDA implementations and investigate three strategies to probe the efficiency and scalability of incompressible flow computations on the Lincoln Tesla cluster at the National Center for Supercomputing Applications (NCSA). We exploit some of the advanced features of MPI and CUDA programming to overlap both GPU data transfer and MPI communications with computations on the GPU. We sustain approximately 2.4 TeraFLOPS on the 64 nodes of the NCSA Lincoln Tesla cluster using 128 GPUs with a total of 30,720 processing elements. Our results demonstrate that multi-GPU clusters can substantially accelerate computational fluid dynamics (CFD) simulations.

Copyright Statement

This document was originally published by the American Institute of Aeronautics and Astronautics (AIAA) in 48th AIAA Aerospace Sciences Meeting proceedings. Copyright restrictions may apply.

Publication Information

Jacobsen, Dana A.; Thibault, Julien C.; and Senocak, Inanc. (2010). "An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters". 48th AIAA Aerospace Sciences Meeting Including The New Horizons Forum and Aerospace Exposition.

Included in

Mechanical Engineering Commons, Other Computer Engineering Commons
