The impact of visual and auditory information on subtitle processing: an eye-tracking study
Despite a growing body of work on subtitle processing, previous studies have mostly drawn on global analyses of eye-movement measures, thus providing little insight into the lexical and post-lexical processing of subtitles. Moreover, much remains unknown regarding how the interaction and integration of multiple sources of information might affect the reading of subtitles. To shed light on these underexplored questions, this thesis presents a series of eye-tracking experiments that investigated the influence of visual and auditory information on the reading of subtitles, using both global and local (word-based) eye-movement measures.
It was found that concurrent background video content enhanced comprehension, which stands in contrast to the redundancy effect, whereby presenting the same information in multiple forms can interfere with learning (Kalyuga & Sweller, 2014). Concurrent video presentation also influenced participants' eye-movement behaviours when reading subtitles, producing fewer and shorter fixations as well as longer saccades relative to the condition without video presentation. Furthermore, when the video content was more congruent with the subtitles, participants made fewer inter-word regressions when reading them, and this effect was attenuated as subtitle speed increased.
The presence and nature of the auditory input also influenced how people engaged in the reading of L2 (i.e., second language) subtitles. Overall, as the auditory information progressively reduced the necessity of reading (and understanding) the subtitles (i.e., from no audio to L2 audio to L1 audio), participants made fewer and shorter fixations, produced longer saccades on the subtitles, and skipped more subtitles. Local analyses of eye-movement measures revealed that participants were able to gauge their need for the subtitles accurately and to adapt their eye movements to this need. When provided with L1 audio, participants engaged in less post-lexical processing of subtitles than with L2 audio or no audio. By contrast, L2 audio accentuated the post-lexical processing of subtitles, indicating that participants might use the audio to support their reading of the subtitles, to the extent that they could engage more with the text and process it more deeply.
Taken together, these findings have important implications for understanding the perceptual and cognitive processes that support reading in multimodal situations, such as reading subtitles in video. Eye-movement control (when and where to move the eyes) in multimodal reading situations is considerably more complicated than in the reading of static text. On the one hand, the results suggest that eye movements during the reading of subtitles are still largely controlled by linguistic processing associated with word properties (e.g., word frequency and length), consistent with what is observed in the reading of static text. On the other hand, the decision about when and where to move the eyes not only reflects the extent to which the perceptual and cognitive systems are responsive to task demands, but also reflects sophisticated metacognitive strategies that participants employ to adapt their eye-movement behaviours, maximizing the contribution of different sources of information in the service of maintaining a desirable level of comprehension.