IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532

The final published version of this article is available; please refer to, and cite, that version.

Deepfake speech detection: approaches from acoustic features related to auditory perception to deep neural networks
Masashi UNOKI, Kai LI, Anuwat CHAIWONGYEN, Quoc-Huy NGUYEN, Khalid ZAMAN
Journal: Free access, Advance online publication

Article ID: 2024MUI0001

Abstract

Artificial replicas of authentic media that are skillfully fabricated with advanced AI-based generators are known as “deepfakes.” Deepfakes have become a growing concern due to their increasing circulation in cyber-physical spaces. In particular, deepfake speech, which is fabricated with advanced AI-based speech analysis/synthesis techniques, can be abused to spoof and tamper with authentic speech signals, enabling attackers to commit serious offenses such as fraud through voice impersonation and the circumvention of automatic speaker verification. Our research project aims to construct a basis of auditory-media signal processing for defending against deepfake speech attacks. To this end, we introduce the current challenges and state-of-the-art techniques in deepfake speech detection and examine current trends and remaining issues. We then introduce the basis of acoustic features related to auditory perception and propose methods for detecting deepfake speech based on auditory-media signal processing that combines these features with deep neural networks (DNNs).
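As a rough illustration of the kind of pipeline the abstract describes, the sketch below pairs an auditory-motivated front end with a small DNN classifier. It is a minimal sketch under stated assumptions only: the log constant-Q spectrogram front end (one common perceptually motivated feature in anti-spoofing work), the CNN architecture, the 16-kHz sampling rate, and the file name utterance.wav are illustrative choices, not the authors' actual features or models.

```python
# Minimal sketch: auditory-motivated features + a small DNN detector.
# ASSUMPTIONS (not from the paper): log constant-Q spectrogram front end,
# tiny CNN binary classifier, 16-kHz audio, hypothetical file "utterance.wav".
import numpy as np
import librosa
import torch
import torch.nn as nn

def auditory_features(wav_path, n_bins=84, hop_length=256):
    """Log-magnitude constant-Q spectrogram: a log-frequency representation
    that loosely mirrors auditory frequency analysis."""
    y, sr = librosa.load(wav_path, sr=16000)
    cqt = np.abs(librosa.cqt(y, sr=sr, hop_length=hop_length, n_bins=n_bins))
    return librosa.amplitude_to_db(cqt, ref=np.max)  # shape: (n_bins, n_frames)

class DeepfakeDetector(nn.Module):
    """Small CNN mapping a (1, n_bins, n_frames) feature map to one
    bona-fide/spoof logit."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # pool over time and frequency
        )
        self.fc = nn.Linear(32, 1)     # single logit: spoof vs. bona fide

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

# Usage (untrained weights, for illustration only):
feats = auditory_features("utterance.wav")
x = torch.from_numpy(feats).float().unsqueeze(0).unsqueeze(0)  # (1,1,bins,frames)
print("spoof probability:", torch.sigmoid(DeepfakeDetector()(x)).item())
```

Other auditory-inspired front ends discussed in the anti-spoofing literature, such as gammatone-filterbank or modulation-spectral features, would slot into auditory_features() in the same way.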

© 2024 The Institute of Electronics, Information and Communication Engineers