Exposing opportunities for parallelization while explicitly managing data locality is the primary challenge in porting and optimizing existing computational science simulation codes to improve performance and accuracy. OpenMP provides many mechanisms for expressing parallelism, but grouping computations to improve data locality remains primarily the programmer’s responsibility. The loop chain abstraction, in which data access patterns are included with the specification of parallel loops, gives compilers sufficient information to automate the tradeoff between parallelism and data locality. In this paper, we present a loop chain pragma and an extension to the omp for directive that enable the specification of loop chains and of high-level schedules on those loop chains. We show example usage of the extensions, describe their implementation, and present preliminary performance results for some simple examples.
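To make the tradeoff concrete, the following is a minimal sketch in standard OpenMP C. It does not use the loop chain pragma or the omp for extension, whose syntax is not given in this abstract. The function names (stencil3, chain_separate, chain_fused), the 3-point stencil, and the tile parameter are all hypothetical and chosen only for illustration. The first version runs two parallel loops in sequence, so the intermediate array b is written in full before it is read again. The second version is a hand-written fused and tiled schedule that keeps intermediate values in registers, at the cost of recomputing two of them at each tile boundary. It is one example of the kind of transformation that a programmer must otherwise apply manually, not the paper's implementation.

```c
/* Hypothetical 3-point average used by both loops in the chain. */
static double stencil3(const double *x, long i) {
    return (x[i - 1] + x[i] + x[i + 1]) / 3.0;
}

/* Version 1: a series of two parallel loops.
 * Every element of b is written by the first loop and read back by the
 * second, so for large n the intermediate data leaves the cache in between. */
void chain_separate(const double *a, double *b, double *c, long n) {
    #pragma omp parallel for
    for (long i = 1; i < n - 1; i++) {
        b[i] = stencil3(a, i);
    }
    #pragma omp parallel for
    for (long i = 2; i < n - 2; i++) {
        c[i] = stencil3(b, i);
    }
}

/* Version 2: the same chain, fused and tiled by hand (assumes tile > 0).
 * Each tile recomputes the two intermediate values at its left edge, so
 * tiles are independent and can run in parallel, and b is never stored. */
void chain_fused(const double *a, double *c, long n, long tile) {
    #pragma omp parallel for
    for (long t = 2; t < n - 2; t += tile) {
        long end = (t + tile < n - 2) ? t + tile : n - 2;
        double prev = stencil3(a, t - 1);   /* value b[t-1] would hold */
        double cur  = stencil3(a, t);       /* value b[t] would hold   */
        for (long i = t; i < end; i++) {
            double next = stencil3(a, i + 1); /* value b[i+1] would hold */
            c[i] = (prev + cur + next) / 3.0;
            prev = cur;
            cur = next;
        }
    }
}
```

Both functions fill c[2] through c[n-3] with the same values. Compiling without OpenMP support simply ignores the pragmas and runs the loops serially. Choosing between such schedules, and picking the tile size, is the decision that the abstract proposes to hand to the compiler.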
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. doi: 10.1109/WACCPD.2016.010
Bertolacci, Ian J.; Strout, Michelle Mills; Guzik, Stephen; Riley, Jordan; and Olschanowsky, Catherine. (2016). "Identifying and Scheduling Loop Chains Using Directives". Proceedings of WACCPD 2016: Third Workshop on Accelerator Programming Using Directives, 57-67. http://dx.doi.org/10.1109/WACCPD.2016.010