2020 Volume E103.D Issue 10 Pages 2104-2112
The security and reliability of Arabic text exchanged via the Internet have become a challenging area for the research community. Arabic text is very sensitive to modify by malicious attacks and easy to make changes on diacritics i.e. Fat-ha, Kasra and Damma, which are represent the syntax of Arabic language and can make the meaning is differing. In this paper, a Hybrid of Natural Language Processing and Zero-Watermarking Approach (HNLPZWA) has been proposed for the content authentication and tampering detection of Arabic text. The HNLPZWA approach embeds and detects the watermark logically without altering the original text document to embed a watermark key. Fifth level order of word mechanism based on hidden Markov model is integrated with digital zero-watermarking techniques to improve the tampering detection accuracy issues of the previous literature proposed by the researchers. Fifth-level order of Markov model is used as a natural language processing technique in order to analyze the Arabic text. Moreover, it extracts the features of interrelationship between contexts of the text and utilizes the extracted features as watermark information and validates it later with attacked Arabic text to detect any tampering occurred on it. HNLPZWA has been implemented using PHP with VS code IDE. Tampering detection accuracy of HNLPZWA is proved with experiments using four datasets of varying lengths under multiple random locations of insertion, reorder and deletion attacks of experimental datasets. The experimental results show that HNLPZWA is more sensitive for all kinds of tampering attacks with high level accuracy of tampering detection.