Degree Title

Master of Science in Computer Science


Computer Science

Major Advisor

Elena Sherman, Ph.D.


James Buffenbarger, Ph.D.


Bogdan Dit, Ph.D.


Software testing is an integral part of the software development process. To test certain parts of software, developers need to identify inputs that reach those parts. Data and control dependencies make this a non-trivial task, and as the complexity of software increases, it becomes more difficult to derive such inputs manually. Because of complex data manipulations, this process is even more challenging for programs with string inputs, such as security applications. Thus, automated reachability test input generation for string data types is an important research area.

Symbolic execution is a path-sensitive static program analysis technique that can automatically generate conditions for inputs that reach a given program location. Commonly, such conditions are encoded as automata that describe the set of strings at that location. These automata result from the string operations applied to inputs along the path. However, they do not necessarily correspond to the string inputs that produce the string values at the program location. To find those input values, we need to undo the effects of string operations through backward analysis. The intricate relationships between symbolic string values complicate this process. These relationships are due to non-injective string operations and data-flow dependencies of string values.
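As a rough illustration of why non-injective string operations complicate backward analysis, the sketch below enumerates all inputs that a single operation maps to a given value at a program location. It is not the method of this thesis, which works on automata. The alphabet, the operation `program_op`, and the helper `preimage` are hypothetical names chosen for the example, and the inverse is computed by brute-force enumeration over a bounded length.

```python
from itertools import product

# Hypothetical three-symbol alphabet, kept tiny so enumeration stays cheap.
ALPHABET = "aA "


def program_op(s):
    """Forward string operation applied along the path: trim, then lowercase.

    The operation is non-injective: several distinct inputs yield the same output.
    """
    return s.strip().lower()


def preimage(target, max_len):
    """Return every input of length <= max_len that program_op maps to target.

    This is a brute-force stand-in for an inverse string operation.
    """
    found = []
    for n in range(max_len + 1):
        for chars in product(ALPHABET, repeat=n):
            candidate = "".join(chars)
            if program_op(candidate) == target:
                found.append(candidate)
    return found


# All inputs of length at most 2 that reach the location with the value "a".
inputs = preimage("a", 2)
print(inputs)
```

Even for this single operation and a bound of two characters, six different inputs produce the value "a", so a backward analysis must track a set of candidate inputs rather than a single string.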

This thesis presents a novel method for test input generation for automata-based string constraints. It uses single-track automata along with novel computational techniques to perform inverse string operations. Empirical evaluations on a set of benchmarks have shown this method to be effective in solving automata-based string constraints from real-world applications.