Document Type

Conference Proceeding

Publication Date

2022

Abstract

Incremental dialogue processing has been an important topic in spoken dialogue systems research, but the broader research community that makes use of language interaction (e.g., chatbots, conversational AI, spoken interaction with robots) has not adopted incremental processing despite research showing that humans perceive incremental dialogue as more natural. In this paper, we extend prior work that identifies the requirements for making spoken interaction with a system natural, with the goal that our framework will be generalizable to many domains where speech is the primary method of communication. The Incremental Unit framework offers a model of incremental processing that has been extended to be multimodal and temporally aligned, to enable real-time information updates, and to create a complex network of information as a fine-grained information state. One challenge is that multimodal dialogue systems often have computationally expensive modules, requiring computation to be distributed. Most importantly, when speech is the means of communication, it brings the added expectation not only that systems understand what humans say, but also that they respond without delay. In this paper, we build on top of the Incremental Unit framework and make it amenable to a distributed architecture made up of a robot and spoken dialogue system modules. To enable fast communication between the modules and to maintain module state histories, we compare two different implementations of a distributed Incremental Unit architecture. We evaluate both implementations systematically and then with real human users, and show that the implementation that uses an external attribute-value database is preferred, but there is some flexibility in which variant to use depending on the circumstances.
This work offers the Incremental Unit framework as an architecture for building powerful, complete, and natural dialogue systems, and is particularly relevant to researchers working on robots and multimodal systems.

Comments

HCII 2022: Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Health, Operations Management, and Design is volume 13320 of the Lecture Notes in Computer Science book series.

Copyright Statement

This version of the article has been accepted for publication and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/978-3-031-06018-2_24

Publication Information

Imtiaz, Mir Tahsin and Kennington, Casey. (2022). "Incremental Unit Networks for Distributed, Symbolic Multimodal Processing and Representation". In V.G. Duffy (Ed.), HCII 2022: Digital Human Modeling and Applications in Health, Safety, Ergonomics and Risk Management. Health, Operations Management, and Design, (Lecture Notes in Computer Science series, Volume 13320, pp. 344-393). Springer. https://doi.org/10.1007/978-3-031-06018-2_24

Included in

Computer Sciences Commons

ScholarWorks

Computer Science Faculty Publications and Presentations

Incremental Unit Networks for Distributed, Symbolic Multimodal Processing and Representation
