Publication Date


Type of Culminating Activity


Degree Title

Master of Science in Computer Science


Computer Science

Major Advisor

Inanc Senocak


Amit Jain


Timothy Barth


Today’s Graphics Processor Units (GPU) are powerful computation platforms used not only for graphic rendering but also for multi-purpose computation. Now reaching a teraflops of peak performance and over a 100 GB/sec of bandwidth, GPUs outperform the latest CPUs and provide a new high-performance computing platform. New languages such as CUDA and Brook+ allow developers to target the programmable unit of the GPUs without a graphics programming background. Scientists and engineers in various fields have started benefiting from the last generations of GPUs. In this thesis, the implementation of a Navier-Stokes solver for incompressible flow around urban-like domains is presented. Transport and dispersion of contaminants in urban environments is an area of intense research. The computational fluid dynamic (CFD) models necessary to provide realistic simulations require heavy computation, usually only possible on CPU clusters. This thesis presents the base for an urban dispersion model implementation on desktop platforms, using one or multiple GPUs as coprocessors. The governing equations implemented for this thesis are common to many problems in CFD where flow motion is involved. Using a single Tesla C870 GPU card, the CUDA implementation of the lid-driven cavity problem runs 33 times faster than a serial C code running on a single core of an AMD Opteron 2.4GHz processor. A speedup of 100 was reached by associating the Tesla S870 quad-GPU system to a quad-core CPU machine. Computations for both GPU and CPU are single precision. A more complex application including obstacle capability was developed to model building effects in the domain. Using the quad-GPU system, the flow-field in a domain of 1.28 km × 1.28 km × 320 m was computed. A low Reynolds number flow-field projection of 22 minutes (1000 time steps) could be simulated in 3 minutes. Results show that an urban dispersion is feasible on this type of platform and that models can be run within minutes to provide emergency responses. More generally, it shows that complex CFD problems can benefit from multi-GPU desktop architectures.