A Dynamic Loop Detection, Profiling and Optimization Tool

Publication Date


Type of Culminating Activity


Degree Title

Master of Science in Computer Science


Computer Science

Major Advisor

Gang-Ryung Uh


With the evolution from simple scalar processors to multi-core, multi-threaded processors, the performance gains achieved by sequential programs will no longer be significant. This advance in technology places a greater responsibility on programmers for performance gains, resource utilization, and program optimization. As a result, programmers will have to rely on thread-level parallelism (TLP) and vectorization, in addition to instruction-level parallelism (ILP), to improve performance.

To exploit TLP, applications need to be multi-threaded. Existing sequential applications can be made multi-threaded using manual, automatic, or semi-automatic techniques. Manual techniques have been the predominant method of exploiting TLP, but deciding which regions can be parallelized is difficult, especially in large, complex applications. Moreover, achieving maximum performance gains requires a focus on coarse-grain parallelism. Loops have been easy targets for parallelization, and coarse-grain parallelism implies that the effort to parallelize a loop has to move to higher levels of the loop hierarchy, that is, toward the enclosing outer loops.

In this work, a tool is developed that detects loops, presents them in a complete hierarchical structure, and collects detailed profile information that can help programmers identify potential regions for optimization and parallelization.
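
As an illustration of how a loop hierarchy with profile information could guide the search for coarse-grain parallelism, the following is a minimal sketch and not the tool's actual implementation. The LoopNode fields, the coarse_grain_candidates function, the thresholds, and the sample numbers are all hypothetical. It assumes each loop records an iteration count and an inclusive share of run time, and it reports the outermost loop on each path that is both hot and has enough iterations to divide among threads.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoopNode:
    """One loop in a hypothetical loop hierarchy (all names are illustrative)."""
    name: str            # label for the loop, e.g. a source location
    iterations: int      # total iterations observed while profiling
    time_share: float    # fraction of run time spent in this loop, nested loops included
    children: List["LoopNode"] = field(default_factory=list)


def coarse_grain_candidates(roots, min_share=0.2, min_iterations=8):
    """Return the outermost loops that are hot and have enough iterations.

    The walk stops descending once a loop qualifies, so a nested loop is
    reported only when none of its enclosing loops qualified.
    """
    found = []
    stack = list(reversed(roots))          # reversed so loops pop in source order
    while stack:
        node = stack.pop()
        if node.time_share >= min_share and node.iterations >= min_iterations:
            found.append(node)
        else:
            stack.extend(reversed(node.children))
    return found


# Made-up profile data for three top-level loops.
hierarchy = [
    LoopNode("outer_A", 1000, 0.70, [LoopNode("inner_A1", 1_000_000, 0.65)]),
    LoopNode("outer_B", 2, 0.25, [LoopNode("inner_B1", 50_000, 0.24)]),
    LoopNode("outer_C", 10_000, 0.03),
]

candidate_names = [loop.name for loop in coarse_grain_candidates(hierarchy)]
print(candidate_names)
```

In this made-up data, outer_A is selected in preference to its nested loop, outer_B is skipped because two iterations are too few to divide among threads, so its nested loop is considered instead, and outer_C is ignored because it accounts for little of the run time. Whether a selected loop can actually be parallelized still depends on its data dependences, which this sketch does not examine.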
