Publication Date


Date of Final Oral Examination (Defense)


Type of Culminating Activity


Degree Title

Master of Science in Computer Science


Computer Science

Major Advisor

Elena Sherman, Ph.D.


Jim Buffenbarger, Ph.D.


Bogdan Dit, Ph.D.


Probabilistic Symbolic Execution (PSE) extends Symbolic Execution (SE), a path-sensitive static program analysis technique, by calculating the probabilities with which program paths are executed. PSE relies on the ability of the underlying symbolic models to accurately represent the execution paths of the program as the collection of input values following these paths. While researchers have established PSE for numerical data types, PSE for complex data types such as strings is a novel area of research.

For string data types, SE tools commonly use finite state automata to represent a symbolic string model. PSE therefore inherits automata-based symbolic string models from SE and uses them to calculate the probabilities of the string-based constraints that describe program paths. However, to our knowledge, there is a lack of research on the suitability of automata-based symbolic string models in the context of PSE.

This thesis proposes four automata-based symbolic string models for PSE and analyzes their suitability using two criteria: accuracy and performance. We compare the probability computed by each model to the actual probability, and we measure the amount of time taken to compute it. Our results show that the models vary in accuracy; however, none is able to consistently compute the actual value. In addition, our evaluation revealed that the degree of inaccuracy depends on the characteristics of the software program. From these findings, we offer guidance for selecting an automaton model for PSE based on the performance and accuracy requirements and the characteristics of the program under analysis. Additionally, we suggest future areas of research to address the accuracy and performance deficiencies observed in our evaluation.
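
As an illustration of the general idea of deriving a path probability from an automaton, the following minimal sketch counts the strings accepted by a deterministic finite automaton and divides by the size of the input domain. It is not one of the four models proposed in the thesis. It assumes a uniform distribution over all strings up to a bounded length, and the names `count_accepted`, `path_probability`, and `CONTAINS_A` are hypothetical, introduced only for this example.

```python
from fractions import Fraction


def count_accepted(transitions, start, accepting, alphabet, max_len):
    """Count the strings of length 0..max_len accepted by a DFA.

    transitions maps (state, symbol) -> next state; missing entries
    are treated as transitions to a rejecting dead state.
    """
    # counts[state] = number of strings of the current length reaching state
    counts = {start: 1}
    total = 1 if start in accepting else 0
    for _ in range(max_len):
        next_counts = {}
        for state, n in counts.items():
            for symbol in alphabet:
                target = transitions.get((state, symbol))
                if target is not None:
                    next_counts[target] = next_counts.get(target, 0) + n
        counts = next_counts
        total += sum(n for state, n in counts.items() if state in accepting)
    return total


def path_probability(transitions, start, accepting, alphabet, max_len):
    """Probability that a uniformly chosen string of length <= max_len
    satisfies the constraint encoded by the DFA (assumed distribution)."""
    domain_size = sum(len(alphabet) ** k for k in range(max_len + 1))
    accepted = count_accepted(transitions, start, accepting, alphabet, max_len)
    return Fraction(accepted, domain_size)


# Hypothetical DFA for the path constraint "s contains the character 'a'"
# over the two-symbol alphabet {a, b}.
CONTAINS_A = {
    ("q0", "a"): "q1",
    ("q0", "b"): "q0",
    ("q1", "a"): "q1",
    ("q1", "b"): "q1",
}

if __name__ == "__main__":
    print(path_probability(CONTAINS_A, "q0", {"q1"}, "ab", 2))  # prints 4/7
```

In this toy case the automaton represents the constraint exactly, so the computed probability matches the actual one: of the 7 strings of length at most 2 over {a, b}, 4 contain 'a'. The accuracy question studied in the thesis arises when the automaton only approximates the set of strings that follow a path.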