The effect of language and acoustic degradation on speech intelligibility in young adults
Background: Every day, humans listen to speech that varies in quality. The question the current project addressed is how the brain processes speech that is degraded and/or unintelligible. Previous literature has applied the temporal response function (TRF), a linear filter that describes the mapping between the speech envelope and its neural response, to understand the association between speech intelligibility and brain processing. To advance the current understanding of the neural encoding of speech, we investigated the effects of language and of vocoding, an acoustic degradation of the speech signal, on speech intelligibility. Listeners were exposed to acoustically intact and degraded speech in their native language, in English as a second language (ESL), and in an unknown language.
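To make the acoustic manipulation concrete: a noise vocoder splits the signal into a small number of frequency bands, discards the spectral fine structure in each band, and keeps only the slow amplitude envelope, which is then used to modulate band-limited noise. The Python sketch below illustrates a generic noise vocoder of this kind; the band edges, filter order, and function names are illustrative assumptions, not the stimulus-generation code used in the study.

```python
# Illustrative sketch of an n-channel noise vocoder (not the study's code).
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=4, f_lo=100.0, f_hi=8000.0):
    """Replace spectral detail with envelope-modulated noise carriers."""
    # Logarithmically spaced band edges across the analysis range (assumed)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    out = np.zeros_like(signal, dtype=float)
    rng = np.random.default_rng(0)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, signal)
        # Band envelope; in practice this is usually also low-pass filtered
        env = np.abs(hilbert(band))
        # Noise carrier restricted to the same band
        carrier = sosfiltfilt(sos, rng.standard_normal(len(signal)))
        out += env * carrier
    return out / np.max(np.abs(out))  # normalise to avoid clipping
```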
Methods: Thirty normal-hearing young adults were recruited for this study. Fifteen participants were native English speakers (aged 20.1-35.1 years, 3 males), and fifteen were native Thai speakers (aged 20.2-38.8 years, 5 males). All participants completed a language assessment (the LexTALE test) to quantify their English language proficiency. The EEG experiment comprised four listening conditions of twelve minutes each (two languages × two acoustic conditions): each participant listened to English and Thai stories as both clear speech and 4-channel vocoded (degraded) speech. The EEG data were analysed with the forward model of the temporal response function (TRF) framework to determine how well the speech envelope-brain mapping could predict the neural response to clear and acoustically degraded speech in a native versus a non-native or unknown language.
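In the forward TRF framework, a regularised linear filter is estimated from time-lagged copies of the speech envelope to each EEG channel, and predictive accuracy is the Pearson correlation between the EEG predicted by that filter and the recorded EEG. The following is a minimal Python sketch of this idea, assuming preprocessed, time-aligned arrays `envelope` (samples,) and `eeg` (samples × channels) at sampling rate `fs`; all names and parameter values are illustrative, and a full pipeline (e.g., the mTRF toolbox) would also tune the ridge parameter and fit per condition.

```python
# Minimal sketch of a forward TRF via time-lagged ridge regression.
import numpy as np

def lag_matrix(stim, fs, tmin=-0.1, tmax=0.4):
    """Design matrix of time-lagged copies of the stimulus envelope."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    X = np.zeros((len(stim), len(lags)))
    for j, lag in enumerate(lags):
        if lag < 0:
            X[:lag, j] = stim[-lag:]   # X[t] = stim[t + |lag|]
        elif lag > 0:
            X[lag:, j] = stim[:-lag]   # X[t] = stim[t - lag]
        else:
            X[:, j] = stim
    return X

def fit_trf(envelope, eeg, fs, alpha=1e2):
    """Ridge solution w = (X'X + aI)^-1 X'y, one filter per EEG channel."""
    X = lag_matrix(envelope, fs)
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ eeg), X

def predictive_accuracy(envelope, eeg, fs, alpha=1e2):
    """Pearson correlation between predicted and recorded EEG per channel."""
    w, X = fit_trf(envelope, eeg, fs, alpha)
    pred = X @ w
    num = ((pred - pred.mean(0)) * (eeg - eeg.mean(0))).sum(0)
    den = np.sqrt(((pred - pred.mean(0)) ** 2).sum(0)
                  * ((eeg - eeg.mean(0)) ** 2).sum(0))
    return num / den
```

Note that this sketch computes in-sample correlation for brevity; in practice, predictive accuracy is evaluated on held-out data (e.g., leave-one-trial-out cross-validation) so that the filter cannot simply memorise the response.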
Results: The results revealed a significant main effect of acoustic degradation (F(1, 28) = 27.31, p < .001): TRF predictive accuracy was significantly lower in the vocoded conditions. A one-way ANOVA was performed to investigate the effect of language proficiency. A post-hoc Tukey HSD test showed no significant difference between native and non-native speakers (p = .632) or between known and unknown languages (p = .992) in the clear conditions. However, TRF predictive accuracy dropped significantly from the clear to the vocoded condition in the native-Thai listening conditions (p < .001) but not in the native-English listening conditions (p = .678).
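As an illustration of how a comparison of this kind could be computed (not the authors' actual analysis script), the sketch below runs a mixed-design ANOVA with the pingouin package on synthetic stand-in data; the column names and all values are hypothetical.

```python
# Illustrative mixed-design ANOVA on TRF predictive accuracy.
import numpy as np
import pandas as pd
import pingouin as pg

# Synthetic long-format data: one row per subject per acoustic condition.
rng = np.random.default_rng(1)
rows = []
for i in range(30):
    group = "English" if i < 15 else "Thai"
    for cond, mu in [("clear", 0.08), ("vocoded", 0.05)]:
        rows.append({"subject": f"s{i:02d}", "group": group,
                     "acoustic": cond, "accuracy": rng.normal(mu, 0.02)})
df = pd.DataFrame(rows)

# Within-subject factor: acoustic condition; between-subject factor: group.
print(pg.mixed_anova(data=df, dv="accuracy", within="acoustic",
                     subject="subject", between="group"))

# Post-hoc Tukey HSD comparison on the between-subject factor.
print(pg.pairwise_tukey(data=df, dv="accuracy", between="group"))
```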
Conclusion: Altogether, our results showed that TRF predictive accuracy for clear speech was similar whether English speakers heard their native language or an unknown language, indicating the importance of auditory processing. For acoustically degraded speech in a tonal language, however, TRF predictive accuracy was significantly poorer, indicating the importance of language processing.