

"What's this?": Understanding User Interaction Behaviour with Multimodal Input Information Retrieval System
Silang Wang, Hyeongcheol Kim, Nuwan Janaka, Kun Yue, Hoang-Long Nguyen, Shengdong Zhao, Haiming Liu, Khanh-Duy Le
MobileHCI '24 Adjunct: Adjunct Proceedings of the 26th International Conference on Mobile Human-Computer Interaction
"What's this?": Understanding User Interaction Behaviour with Multimodal Input Information Retrieval System
Silang Wang, Hyeongcheol Kim, Nuwan Janaka, Kun Yue, Hoang-Long Nguyen, Shengdong Zhao, Haiming Liu, Khanh-Duy Le
MobileHCI '24 Adjunct: Adjunct Proceedings of the 26th International Conference on Mobile Human-Computer Interaction
Paper Abstract
Human communication relies on integrated multimodal channels to facilitate rich information exchange. Building on this foundation, researchers have long speculated about the potential benefits of incorporating multimodal input channels into conventional information retrieval (IR) systems to support users’ complex daily IR tasks more effectively. However, the true benefits of such integration remain uncertain. This paper presents a series of exploratory pilot tests comparing Multimodal Input IR (MIIR) with Unimodal Input IR (UIIR) across various IR scenarios, concluding that MIIR offers distinct advantages over UIIR in terms of user experience. Our preliminary results suggest that MIIR could reduce the cognitive load associated with IR query formulation by allowing users to formulate different query components in a unified manner across different input modalities, particularly when conducting complex exploratory search tasks in unfamiliar, in-situ contexts. The discussions stemming from these findings draw scholarly attention and suggest new angles for designing and developing MIIR systems.
Main Roles
Conceptualization
1. Conducted a literature review covering IR interface design, information needs, human-computer interaction, multimodal interaction, and spatial cognition.
2. Read Mind in Motion by Barbara Tversky and Hand and Mind: What Gestures Reveal about Thought by David McNeill.
3. Designed a Wizard-of-Oz interaction flow on Microsoft HoloLens 2 using ChatGPT and the Google search engine: the wizard observed each participant's first-person view through their AR glasses via a Google Meet session maintained between them, used ChatGPT to transcribe the first-person-view images and the participant's spoken words into textual queries, and submitted those queries to Google to return relevant search engine result pages, thereby simulating a MIIR system.
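The wizard's pipeline in step 3 could be sketched roughly as below. This is an illustrative reconstruction, not the study's actual code: the function names (`build_vision_messages`, `google_search_url`), the prompt text, and the use of the OpenAI-style multimodal message format are all assumptions; the real study had a human wizard driving ChatGPT and Google manually.

```python
import base64
from urllib.parse import quote_plus

# Hypothetical instruction given to the vision-capable model (illustrative only).
QUERY_PROMPT = (
    "You are given a frame from a participant's first-person view and their "
    "spoken words. Combine both into one concise web search query. "
    "Return only the query text."
)

def build_vision_messages(frame_jpeg: bytes, spoken_words: str) -> list:
    """Compose a multimodal chat request: one first-person-view frame
    plus the participant's spoken words, in the common image_url format."""
    data_url = "data:image/jpeg;base64," + base64.b64encode(frame_jpeg).decode()
    return [
        {"role": "system", "content": QUERY_PROMPT},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": f'Spoken words: "{spoken_words}"'},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        },
    ]

def google_search_url(query: str) -> str:
    """Turn the model's textual query into a Google SERP URL that the
    wizard could open and share back to the participant's AR view."""
    return "https://www.google.com/search?q=" + quote_plus(query)
```

In the simulated system, the messages would be sent to a vision-capable chat model, and the returned query string passed to `google_search_url` to produce the result page shown to the participant.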
Evaluation
1. Iteratively designed and conducted 13 versions of comparative pilot studies evaluating the performance and user experience of the MIIR system against a UIIR system across different search scenarios; four pilot studies were eventually included in the publication.
2. Compiled a summary of the user experience and effectiveness of MIIR systems across different IR scenarios, supported by preliminary qualitative and quantitative evidence.