Assessing human-like artificial agents from dynamical models of human perceptual-motor behaviour
Robust human-artificial agent interaction requires that both co-actors mutually respond and adapt to each other and to changes in task constraints. However, the degree to which a human and an artificial agent (AA) form an effective synergy depends greatly on whether the AA is capable of seamless and natural interaction with the human. Recently, machine learning and deep reinforcement learning (DRL) algorithms have come to dominate the field of artificial intelligence and have been shown to train AAs to perform various tasks at or above human levels of performance. However, the moment-to-moment actions and movements of DRL-agents are typically not human-like and can often be distinguished from those of human agents simply by observing the AA engaged in task behavior. Accordingly, developing interactive AAs using many standard DRL methods can have undesired consequences for human-AA interaction and training systems if the spatiotemporal patterning of AA behavior is overlooked. Key to developing AAs capable of robust human-AA interaction is ensuring that the action capabilities of AAs effectively match or complement those of human actors. Despite the assumed complexity of human task behavior, a growing body of research has demonstrated that most human perceptual-motor behaviors can be modelled using a small set of dynamical perceptual-motor primitives (DPMPs), which correspond to the fundamental behaviors of nonlinear dynamical systems, namely (i) point-attractor and (ii) limit-cycle dynamics, with the former capturing discrete movements (e.g., tapping a key) and the latter capturing rhythmic movements (e.g., walking); an illustrative formulation of these two primitive classes is sketched below.

Motivated by this previous research, the current thesis investigated: [1] whether AAs whose movement and action dynamics are defined by DPMP models exhibit behavior that is (1a) comparable to human behavior, (1b) as effective as an expert human trainer in training novice humans in task performance, and (1c) more human-like and more effective than DRL-agents at training novice humans in task performance; [2] how the strengths of DPMP and DRL models could be combined in hybrid models to achieve higher performance and shorter training times; and [3] how assessing differences in the parametrization of a task-dynamical model of human navigation could elucidate differences in human and DRL-agent behavior. The thesis investigated these questions across four papers using two task contexts: a multiagent herding task and a single-actor navigation task.
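As a point of reference for the two primitive classes introduced above, DPMPs are commonly formalized as second-order dynamical systems. The equations below are a minimal, illustrative sketch using generic parameter names (b, k, delta, beta, gamma, omega); they are not the specific formulations or parameter values employed in the thesis.

```latex
% Point-attractor (discrete-movement) primitive: a damped mass-spring system
% driving the effector state x toward a goal state x_g
% (b = damping, k = stiffness).
\begin{equation}
  \ddot{x} = -b\,\dot{x} - k\,(x - x_g)
\end{equation}

% Limit-cycle (rhythmic-movement) primitive: a hybrid Rayleigh--van der Pol
% oscillator; with \delta < 0 and \beta, \gamma > 0, low-amplitude motion is
% amplified and high-amplitude motion is damped, producing a stable,
% self-sustaining oscillation with frequency close to \omega.
\begin{equation}
  \ddot{x} = -\delta\,\dot{x} - \beta\,x^{2}\dot{x} - \gamma\,\dot{x}^{3} - \omega^{2}x
\end{equation}
```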
Paper I assessed the degree to which DPMP-agents can be employed for human team training by examining whether a DPMP model of human herding behavior could be embedded in the control architecture of an AA to train novice human actors to learn a multiagent herding task at a level comparable to an expert human trainer. Paper II extended this research by comparing the training outcomes of three different types of AAs: two DPMP-based agents and one DRL-agent. Paper II also examined participants' subjective preferences for the three agents and the relationship between these preferences and human-AA performance. Paper III examined the effectiveness of hybrid dynamical-DRL models in the herding task and demonstrated that this hybrid approach has the potential to significantly improve both training time and agent performance compared to non-hybrid dynamical and DRL-agents. Paper IV investigated human and DRL-agent trajectories through various obstacle-ridden environments. A DPMP model of steering and obstacle avoidance (Fajen and Warren, 2003), sketched below, was then fitted to each raw trajectory by modulating key model parameters, and the resulting agent-specific parameters were used to simulate new trajectories, which were analyzed to identify differences in human and DRL-agent route selection and obstacle-avoidance tuning.

Overall, the results demonstrated that although DRL-agents can produce behavior similar to that of humans for simple tasks (i.e., route navigation), the behavior of DRL-agents is less human-like and less effective than that of DPMP-agents in more complex human-AA interaction and training contexts. In contrast, AAs developed using DPMP and hybrid dynamical-DRL models are not only capable of robust, 'human-like' behavioral interaction, but are also preferred over DRL-agents and have the potential to provide the same level of adaptive training as expert human trainers.
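For context, the Fajen and Warren (2003) behavioral dynamics model of steering and obstacle avoidance is commonly presented in approximately the form below, in which heading is attracted toward the goal and repelled by obstacles. The symbols follow common presentations of the model and are given only as a sketch; the exact parameterization fitted in Paper IV may differ.

```latex
% Behavioral dynamics of steering and obstacle avoidance (after Fajen & Warren, 2003):
% phi      : current heading           b   : damping on heading change
% psi_g    : goal direction            d_g : distance to the goal
% psi_o    : direction of obstacle o   d_o : distance to obstacle o
% k_g, k_o : goal-attraction / obstacle-repulsion stiffness
% c_1--c_4 : decay constants; c_2 keeps goal attraction nonzero at large distances
\begin{equation}
  \ddot{\phi} = -b\,\dot{\phi}
                - k_g\,(\phi - \psi_g)\bigl(e^{-c_1 d_g} + c_2\bigr)
                + \sum_{o} k_o\,(\phi - \psi_o)\,e^{-c_3\lvert\phi - \psi_o\rvert}\,e^{-c_4 d_o}
\end{equation}
```

Fitting agent-specific values of parameters such as the stiffness and decay terms to observed trajectories is what provides a common, interpretable space in which human and DRL-agent route selection and obstacle-avoidance tuning can be compared.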