Document Type

Conference Proceeding

Publication Date



In this paper we seek to understand how people interpret a social robot’s performance of an emotion, which we term ‘affective display,’ and the positive or negative valence of that affect. To this end, we tasked annotators with observing the Anki Cozmo robot perform its more than 900 pre-scripted behaviors and labeling those behaviors with 16 possible affective display labels (e.g., interest, boredom, disgust). In our first experiment, we trained a neural network to predict annotated labels given multimodal information about the robot’s movement, face, and audio. The results suggest that pairing affects and predicting the valence between them is more informative than predicting individual labels, which we confirmed in a second experiment. Both experiments show that certain modalities are more useful than others for predicting displays of affect and valence. For our final experiment, we generated novel robot behaviors and tasked human raters with assigning scores to valence pairs instead of applying labels, then compared our model’s predictions of valence between the affective pairs to the human ratings. We conclude that some modalities carry information that can be contributory or inhibitive when considered in conjunction with other modalities, depending on the emotional valence pair being considered.

Copyright Statement

This document was originally published in Proceedings of Robotics: Science and Systems by Robotics Science and Systems. Copyright restrictions may apply.