Using the Loop Chain Abstraction to Schedule Across Loops in Existing Code
Exposing opportunities for parallelisation while explicitly managing data locality is the primary challenge in porting and optimising computational science simulation codes for performance. OpenMP provides mechanisms for expressing parallelism, but it remains the programmer's responsibility to group computations to improve data locality. The loop chain abstraction, in which a summary of data access patterns is attached as pragmas to parallel loops, provides compilers with sufficient information to automate the trade-off between parallelism and data locality. We present the syntax and semantics of loop chain pragmas, which describe the loops belonging to a loop chain and specify a high-level schedule for that chain. We show example usage of the pragmas, detail attempts to automate the transformation of a legacy scientific code written with specific language constraints into loop chain code, describe the compiler implementation for loop chain pragmas, and present performance results for a computational fluid dynamics benchmark.
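The trade-off between parallelism and data locality can be illustrated with a minimal sketch in C. It does not use the loop chain pragmas themselves: the only directives shown are standard OpenMP `parallel for`, and the function names `chain_unfused` and `chain_fused`, the three-point average, and the array names are assumptions made purely for illustration. The first function runs two parallel loops back to back, so the intermediate array `b` is written in full and then read again. The second is the hand-fused form that a programmer would otherwise have to write manually, in which each `b[i]` is consumed immediately after it is produced.

```c
/* Two consecutive parallel loops: each loop is fully parallel, but the
 * intermediate array b is streamed through memory twice. */
void chain_unfused(const double *a, double *b, double *c, int n)
{
    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i)
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0;  /* reads a at i-1, i, i+1 */

    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i)
        c[i] = 2.0 * b[i];                          /* reads b at i only */
}

/* Hand-fused version: fusion is legal here because the second loop reads
 * b only at the index the first loop has just written. */
void chain_fused(const double *a, double *b, double *c, int n)
{
    #pragma omp parallel for
    for (int i = 1; i < n - 1; ++i) {
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0;
        c[i] = 2.0 * b[i];                          /* b[i] is reused while still hot */
    }
}
```

Both functions compute the same values in `b` and `c`. Whether fusion is legal depends on the access pattern of each loop, noted in the comments above, and this is the kind of information a data access summary attached to the loops would give a compiler.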
Bertolacci, Ian J.; Mills Strout, Michelle; Riley, Jordan; Guzik, Stephen M. J.; Davis, Eddie C.; and Olschanowsky, Catherine. (2019). "Using the Loop Chain Abstraction to Schedule Across Loops in Existing Code". International Journal of High Performance Computing and Networking, 13(1), 86-104. https://doi.org/10.1504/IJHPCN.2019.097053