Analysis of gender differences in speech and hand gesture coordination for the design of multimodal interface systems
thesis posted on 2022-03-28, 12:48, authored by Jing Liu
Research on Multimodal Interface Systems (MMIS) involving speech and hand gestures has intensified over the past three decades, and understanding how speech and hand gestures correlate has gained significance in MMIS design. Gesture is known to correlate with speech at a number of levels, but less is known about gender differences in this correlation. We hypothesise that, when users interact multimodally with MMIS, there are gender differences in the coordination of speech and hand gestures both internally and externally. Investigating such user-related factors can benefit MMIS by enabling gender-adaptive processing strategies for different gender groups, which can potentially improve system performance. The main methodology used in this thesis is video annotation, comprising hand gesture annotation and speech annotation, to identify gender differences in descriptions of two objects using speech and hand gestures. Our aim is to answer the following questions. Firstly, are there any gender differences in the coordination of speech and hand gestures? We found that females use more hand gestures than males for the same task. This may imply that females and males have different preferences in using the speech and gestural modalities in MMIS. The temporal integration patterns are similar for males and females, but the temporal alignment intervals between gesture strokes and their corresponding lexical affiliates are shorter for females than for males. Secondly, do males and females employ different cognitive processing models in the coordination of speech and hand gestures? Our findings demonstrated that males and females differ in their distribution of cognitive actions. In general, males have more perceptual actions than functional actions, while females have more functional actions than perceptual actions.
Gender differences in cognitive processing models might be the reason for the differences in the distribution of word types accompanying hand gestures. This implies that MMIS can potentially achieve better performance if information processing strategies are designed for different gender groups. Thirdly, are there any differences in the brain activities of males and females when speech accompanies hand gestures? Our findings showed that the lateralisation of brain activities associated with speech and hand gestures differs only slightly between genders. However, we found that females show a stronger beta spectral moment and more significant changes in spectral moment from the alpha to the beta band. This may explain the shorter temporal alignment of speech and hand gestures for females. We demonstrated that gender differences in speech and hand gestures occur both internally (in cognitive processing and brain activities) and externally (in the presentation of speech and hand gestures). Based on the external differences, we developed models to predict the gender of users by evaluating their multimodal actions, using decision tree, neural network and logistic regression models. Our results show that the logistic regression model achieves reasonable performance, with an accuracy over 70%. Thus, we demonstrated that various gender prediction models can be successfully implemented using our findings, and our results are promising for the design of gender-adaptive MMIS.
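As an illustration of the kind of classifier mentioned above, the following is a minimal sketch of binary logistic regression trained by batch gradient descent, written in plain Python. It is not the thesis's implementation: the function names (sigmoid, train_logistic, predict, accuracy), the learning rate, the number of epochs, the single standardised feature and the toy values are all assumptions made for this example, and the numbers are invented solely to exercise the code. They are not data or results from the thesis.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights w and bias b by batch gradient descent on the log-loss.

    X is a list of feature vectors, y a list of 0/1 labels.
    lr and epochs are illustrative defaults, not tuned values.
    """
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return the predicted class (0 or 1) for one feature vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted class matches the label."""
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Hypothetical toy data: one standardised feature per user and an
# arbitrary 0/1 group label. Invented for illustration only.
toy_X = [[-2.0], [-1.0], [1.0], [2.0]]
toy_y = [0, 0, 1, 1]

toy_w, toy_b = train_logistic(toy_X, toy_y)
print("training accuracy on toy data:", accuracy(toy_w, toy_b, toy_X, toy_y))
```

In practice, such a model would be fitted to features derived from the annotated multimodal actions and evaluated on held-out users, so that the reported accuracy reflects generalisation and not memorisation of the training set.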