Replay Attack Detection Based on Spatial and Spectral Features of Stereo Signal

Ryoya Yaguchi; Sayaka Shiota; Nobutaka Ono; Hitoshi Kiya

doi:10.2197/ipsjjip.29.275

Abstract

In this paper, we propose a replay attack detection (RAD) method that uses spatial and spectral features of a stereo signal. To distinguish genuine utterances from replayed ones, we focus on non-speech segments, in which a human speaker emits no sound, whereas a loudspeaker used for a replay attack may emit recorded noise or its own electromagnetic noise. Spatial features based on the generalized cross-correlation (GCC) capture this difference. To improve robustness across varied recording environments, we combine the spatial features with spectral features; specifically, we fuse the output scores of the GCC-based and spectral-feature-based methods. Experiments confirm the effectiveness of combining spatial and spectral features.
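
As a rough illustration of the two ingredients named in the abstract, the following is a minimal sketch of a GCC computed between the two channels of a stereo frame, followed by a score-level fusion. The abstract does not specify the GCC weighting, the lag range, or the fusion rule, so the PHAT weighting, the `max_lag` parameter, the linear fusion, and the function names `gcc_phat` and `fuse_scores` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def gcc_phat(left, right, max_lag=16, eps=1e-12):
    """GCC between two channels for lags -max_lag..max_lag.

    PHAT weighting is an assumption of this sketch; the abstract only
    says "GCC-based" features.
    """
    n = len(left) + len(right)           # zero-pad to limit circular wrap-around
    spec_l = np.fft.rfft(left, n=n)
    spec_r = np.fft.rfft(right, n=n)
    cross = spec_l * np.conj(spec_r)     # cross-power spectrum
    cross /= np.abs(cross) + eps         # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that output index max_lag corresponds to lag 0.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


def fuse_scores(spatial_score, spectral_score, weight=0.5):
    """Linear score fusion; the weight is a hypothetical tuning parameter."""
    return weight * spatial_score + (1.0 - weight) * spectral_score
```

With this sign convention, a right channel that lags the left channel produces a peak at a negative lag. In a detector along these lines, the GCC values from non-speech frames would be passed to a classifier as spatial features, and that classifier's score would then be fused with the score of a separate spectral-feature classifier.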
