Investigating Semantic Properties of Images Generated from Natural Language Using Neural Networks
Date of Final Oral Examination (Defense)
Type of Culminating Activity
Master of Science in Computer Science
Dr. Bogdan Dit, Ph.D.
Dr. Casey Kennington, Ph.D.
Dr. Francesca Spezzano, Ph.D.
This work explores the attributes, properties, and potential uses of generative neural networks for encoding semantics. It works toward answering two questions. If one uses a generative neural network to create a picture from natural language, does the resulting picture encode the text's semantics in a way a computer system can process? Could such a system be more precise than current solutions at detecting, measuring, or comparing the semantic properties of generated images, and thus of their source text or source semantics?
This work is undertaken in the hope that detecting previously unknown properties, or better understanding known ones, could lead to new or improved methods of encoding and processing semantics in a computer system. Improvements in this space could affect many systems that make semantically based decisions. Detecting semantic properties, whether general or specific, and semantic similarity more effectively could improve tasks such as information retrieval, question answering, duplicate (clone) detection, and sentiment analysis. Additionally, it could provide insight into how to better represent semantics in computer systems and thus bring us closer to general artificial intelligence.
To explore this space, this work starts with an experiment that transforms pairs of texts into pairs of images via a generative neural network and explores the properties of those image pairs. The text pairs were known to be either textually and semantically identical, semantically similar, or semantically dissimilar. The resulting image pairs were then tested for similarity via a second neural-network-based process to investigate whether the semantic similarity is preserved during the transformation and thus exists in the resulting image pairs in a quantifiable way.
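The evaluation step of the experiment described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used in this work: the "images" are tiny hand-made pixel lists rather than outputs of a generative network, and plain cosine similarity stands in for the second neural-network-based similarity process. The names `cosine_similarity`, `mean_similarity_by_label`, and `toy_pairs` are hypothetical.

```python
import math
from statistics import mean


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of numbers."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero image as similar to nothing
    return dot / (norm_a * norm_b)


def mean_similarity_by_label(image_pairs):
    """Average the similarity of (image_a, image_b, label) triples per label."""
    scores = {}
    for image_a, image_b, label in image_pairs:
        scores.setdefault(label, []).append(cosine_similarity(image_a, image_b))
    return {label: mean(values) for label, values in scores.items()}


# Toy flattened "images", made up for illustration; NOT real model outputs.
toy_pairs = [
    ([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0], "identical"),
    ([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.5, 0.5], "similar"),
    ([1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], "dissimilar"),
]

result = mean_similarity_by_label(toy_pairs)
for label, score in result.items():
    print(f"{label}: {score:.3f}")
```

If semantic similarity survived the text-to-image transformation, one would expect the average score to fall in the order identical, similar, dissimilar. A real evaluation would also need many pairs per category and a check that the differences are statistically meaningful.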
Preliminary results showed strong evidence that the resulting images encode semantics in a measurable way. However, when the experiment was conducted on a larger dataset, with the generative network more thoroughly trained, the results were weaker. An alternative experiment conducted on different datasets and configurations produced results that were still weaker than those of the preliminary experiments. These findings lead us to believe that the promise of the preliminary results was possibly due to semantics being encoded by the vectorization of the words, and not by the generative neural network. This explanation would account for why the results were worse as the generative neural network took a larger role in the process and better as it took a smaller role. Further tests conducted to evaluate this belief supported it.
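One way to probe an explanation of this kind, though not necessarily the tests run in this work, is to measure how much similarity the word vectors carry on their own, with no generative network involved. The sketch below assumes a simple averaging scheme and uses made-up three-dimensional vectors; `TOY_VECTORS`, `sentence_vector`, and `cosine` are hypothetical names, and the numbers are illustrative rather than experimental results.

```python
import math

# Hypothetical word vectors, invented for illustration only.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.0],
    "sleeps": [0.1, 0.9, 0.0],
    "naps": [0.2, 0.8, 0.0],
    "stocks": [0.0, 0.1, 0.9],
    "fell": [0.0, 0.2, 0.8],
}


def sentence_vector(text):
    """Average the vectors of the known words in a text."""
    words = [w for w in text.lower().split() if w in TOY_VECTORS]
    if not words:
        raise ValueError("no known words in text")
    dims = len(next(iter(TOY_VECTORS.values())))
    return [sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Text-only baseline: similarity measured directly on the vectorized words.
text_only_similar = cosine(sentence_vector("cat sleeps"), sentence_vector("kitten naps"))
text_only_dissimilar = cosine(sentence_vector("cat sleeps"), sentence_vector("stocks fell"))
print(f"similar pair: {text_only_similar:.3f}")
print(f"dissimilar pair: {text_only_dissimilar:.3f}")
```

If a text-only baseline like this separates similar from dissimilar pairs as well as, or better than, the image-based comparison, that would be consistent with the vectorization, rather than the generative network, carrying the semantic signal.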
Schrader, Samuel Ward, "Investigating Semantic Properties of Images Generated from Natural Language Using Neural Networks" (2019). Boise State University Theses and Dissertations. 1561.