Countermeasure system for automatic speaker verification systems
Automatic speaker verification systems use speech signals to verify the claimed identity of an individual. Research in speaker recognition has advanced considerably over the last few decades. Although state-of-the-art systems provide strong authentication performance, they remain vulnerable to malicious spoofing attacks. The three main types of attack are replay, speech synthesis and voice conversion. Current literature shows that these attacks significantly increase the false acceptance rate of speaker verification systems. Concrete evidence of this vulnerability has led researchers to build countermeasures for speaker recognition systems. This thesis focuses on the detection of spoofing attacks based on speech synthesis and voice conversion. We explore the short-time Fourier transform (STFT) and the constant-Q transform (CQT) as front-end features for countermeasure systems. The aim is to understand how different parameterisations of a feature affect countermeasure performance. We also explore how different back-end classifiers affect the performance of countermeasure systems. Finally, we investigate score-level fusion of different single-feature countermeasure systems.
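As a rough illustration of two ideas named above, the sketch below computes a log-power STFT feature whose frame length and hop size can be varied (one possible form of "parameterisation"), and fuses the scores of several countermeasure systems by a weighted sum. The function names, the Hann window, the z-normalisation step and the weighted-sum rule are all assumptions made for this example; they are not taken from the thesis and should not be read as its actual method.

```python
import numpy as np


def log_power_stft(signal, frame_len=512, hop=256):
    """Log-power spectrogram of a 1-D signal, shape (frames, frame_len // 2 + 1).

    Hypothetical helper: frame_len and hop are the parameters one might vary
    when studying how the STFT parameterisation affects a countermeasure.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    # A small constant avoids log(0) on silent frames.
    return np.log(np.abs(spectrum) ** 2 + 1e-10)


def fuse_scores(scores, weights):
    """Weighted-sum score-level fusion (assumed rule, for illustration only).

    scores:  array of shape (n_systems, n_trials), one row per countermeasure.
    weights: array of shape (n_systems,), e.g. tuned on a development set.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # z-normalise each system so that scores are on a comparable scale.
    mean = scores.mean(axis=1, keepdims=True)
    std = scores.std(axis=1, keepdims=True)
    normed = (scores - mean) / (std + 1e-12)
    return weights @ normed / weights.sum()
```

Calling `log_power_stft` on the same utterance with, say, `frame_len=512` and `frame_len=1024` gives two differently parameterised versions of one feature; each could feed its own back-end classifier, and `fuse_scores` would then combine the resulting per-trial scores.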