We propose a method for temporal alignment, a precondition of meaningful fusion, in multimodal systems. The method uses the incremental unit dialogue system framework, which gives the system flexibility in how it handles alignment: either by delaying a modality for a specified amount of time, or by revoking (i.e., backtracking) processed information so that multiple information sources can be processed jointly. We evaluate our approach in an offline experiment with multimodal data and find that the incremental framework is flexible and shows promise as a solution to the problem of temporal alignment in multimodal systems.
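The two alignment strategies named above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the API of any incremental dialogue toolkit. The names `IncrementalUnit`, `delay_modality`, `RevokingFuser`, and the `window` parameter are hypothetical, and the rule for deciding when two units belong together (timestamps within a fixed window) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IncrementalUnit:
    """Hypothetical minimal unit: one piece of input from one modality."""
    modality: str     # e.g. "speech" or "gesture"
    payload: str
    timestamp: float  # seconds on a shared clock (assumed)


def delay_modality(units: List[IncrementalUnit], modality: str,
                   delay: float) -> List[IncrementalUnit]:
    """Strategy 1: shift every unit of one modality later by `delay` seconds."""
    shifted = [
        IncrementalUnit(u.modality, u.payload, u.timestamp + delay)
        if u.modality == modality else u
        for u in units
    ]
    return sorted(shifted, key=lambda u: u.timestamp)


class RevokingFuser:
    """Strategy 2: process units as they arrive; if a unit from another
    modality arrives within `window` seconds of an already processed single
    unit, revoke that result and process both units jointly."""

    def __init__(self, window: float) -> None:
        self.window = window
        self.output: List[Tuple[IncrementalUnit, ...]] = []
        self.log: List[Tuple[str, Tuple[IncrementalUnit, ...]]] = []

    def add(self, unit: IncrementalUnit) -> None:
        for i, group in enumerate(self.output):
            other = group[0]
            if (len(group) == 1
                    and other.modality != unit.modality
                    and abs(other.timestamp - unit.timestamp) <= self.window):
                # Backtrack: withdraw the earlier single-unit result ...
                self.log.append(("revoke", group))
                # ... and replace it with a jointly processed group.
                joint = group + (unit,)
                self.output[i] = joint
                self.log.append(("add", joint))
                return
        group = (unit,)
        self.output.append(group)
        self.log.append(("add", group))


if __name__ == "__main__":
    speech = IncrementalUnit("speech", "that one", 1.0)
    gesture = IncrementalUnit("gesture", "point", 1.25)
    fuser = RevokingFuser(window=0.5)
    fuser.add(speech)
    fuser.add(gesture)
    print([op for op, _ in fuser.log])
```

In this sketch, delaying is simple but requires choosing the delay in advance, whereas revoking lets each modality be processed immediately at the cost of occasionally withdrawing earlier output.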
This is an author-produced, peer-reviewed version of this article. The final, definitive version of this document can be found online at ICMI 2017: Proceedings of the 19th ACM International Conference on Multimodal Interaction, published by the Association for Computing Machinery. Copyright restrictions may apply. doi: 10.1145/3136755.3136769
Kennington, Casey; Han, Ting; and Schlangen, David. (2017). "Temporal Alignment Using the Incremental Unit Framework". ICMI 2017: Proceedings of the 19th ACM International Conference on Multimodal Interaction, 297-301. http://dx.doi.org/10.1145/3136755.3136769