Publication Date

5-2023

Date of Final Oral Examination (Defense)

5-5-2023

Type of Culminating Activity

Thesis

Degree Title

Master of Science in Computer Science

Department

Computer Science

Major Advisor

Catherine Olschanowsky, Ph.D.

Advisor

Jim Buffenbarger, Ph.D.

Advisor

Alejandro N. Flores, Ph.D.

Abstract

Hydrologists must process many gigabytes of data for hydrologic simulations, which consumes considerable time and resources. These performance issues stem mainly from domain scientists’ preference for Python, which trades performance for productivity. In this thesis, I demonstrate that statically compiling Python to C code, combined with several optimizations, reduces the time and resources needed for hydrologic data processing. I developed a Domain Specific Library (DSL) that is a subset of Python and compiles to the Sparse Polyhedral Framework - Intermediate Representation (SPF-IR), which opens up optimizations, such as read reduction fusion, that are not available in Python. We fused the file I/O with the computation so that it operates on small chunks of data (stream computation), which reduces the memory footprint.

The C code we generated from SPF-IR shows an average speed-up of 2.58x over the existing hand-optimized implementations and can entirely eliminate the temporary storage required. DSL users still enjoy the ease of use of Python while getting better performance than the C code.
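
The idea of fusing a file read with a reduction can be illustrated with a small sketch. This is plain Python written for illustration only, not the thesis's DSL or its generated C code; the function names (`unfused_mean`, `fused_mean`, `write_sample`), the binary layout (little-endian 64-bit floats), and the choice of a mean as the reduction are all assumptions. The unfused version loads every value into temporary storage before reducing, while the fused version reduces each chunk as it is read, so only one chunk is in memory at a time.

```python
import os
import struct
import tempfile


def unfused_mean(path):
    """Read the whole file into temporary storage, then reduce it."""
    with open(path, "rb") as f:
        raw = f.read()
    # Temporary storage: every value in the file is held in memory at once.
    values = struct.unpack(f"<{len(raw) // 8}d", raw)
    return sum(values) / len(values)


def fused_mean(path, chunk_values=1024):
    """Fuse the read with the reduction: hold only one chunk at a time."""
    total = 0.0
    count = 0
    with open(path, "rb") as f:
        while True:
            raw = f.read(chunk_values * 8)  # 8 bytes per 64-bit float
            if not raw:
                break
            n = len(raw) // 8
            # Reduce the chunk immediately instead of storing it.
            total += sum(struct.unpack(f"<{n}d", raw))
            count += n
    return total / count


def write_sample(values):
    """Write values as little-endian 64-bit floats to a temporary file."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(struct.pack(f"<{len(values)}d", *values))
    return path
```

Both functions return the same result for a non-empty file; the difference is peak memory, which in the fused version is bounded by `chunk_values` rather than by the file size.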

DOI

https://doi.org/10.18122/td.2060.boisestate
