Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Detecting Unseen Malicious VBA Macros with NLP Techniques
Mamoru Mimura, Hiroya Miura
JOURNAL FREE ACCESS

2019 Volume 27 Pages 555-563

Abstract

In recent years, the number of targeted email attacks that use Microsoft (MS) document files has been increasing. In particular, these MS document files often contain malicious VBA (Visual Basic for Applications) macros. Some researchers have proposed methods to detect malicious MS document files. However, there are few methods that analyze the malicious macros themselves. This paper proposes a method to detect unseen malicious macros using the words extracted from their source code. Malicious macros tend to contain typical functions that download or execute the main body of the malware, as well as obfuscated strings such as encoded or divided characters. Our method builds feature vectors from the corpus with several NLP (Natural Language Processing) techniques. It then trains basic classifiers on the extracted feature vectors and their labels, and the trained classifiers predict the labels of unseen macros. Experimental results show that our method can detect 89% of new malware families. The best F-measure is 0.93.
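
The pipeline described in the abstract (extract words from macro source, turn them into features, train a classifier, predict labels for unseen macros) can be illustrated with a minimal sketch. The abstract does not say which NLP techniques or classifiers are used, so the choices here are assumptions made only for illustration: a bag-of-words representation and a multinomial naive Bayes classifier with add-one smoothing. The four training macros, the function names tokenize, train_naive_bayes and predict, and the example.invalid URL are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus of (macro source, label) pairs; not data from the paper.
TRAIN = [
    ('Sub AutoOpen()\n Set x = CreateObject("WScript.Shell")\n'
     ' x.Run "powershell -enc " & Chr(65) & Chr(66)\nEnd Sub', "malicious"),
    ('Sub Document_Open()\n u = "ht" & "tp://" & "example.invalid/a.exe"\n'
     ' Shell u\nEnd Sub', "malicious"),
    ('Sub FormatTable()\n Selection.Font.Bold = True\n'
     ' Selection.Font.Size = 12\nEnd Sub', "benign"),
    ('Sub SumColumn()\n total = 0\n For i = 1 To 10\n'
     ' total = total + Cells(i, 1).Value\n Next i\nEnd Sub', "benign"),
]


def tokenize(source):
    """Split macro source into lowercase word tokens (keywords, identifiers)."""
    return re.findall(r"[a-z_][a-z0-9_]*", source.lower())


def train_naive_bayes(samples):
    """Count words per label (bag-of-words) for a multinomial naive Bayes model."""
    word_counts = {}       # label -> Counter of token frequencies
    doc_counts = Counter() # label -> number of training macros
    vocab = set()
    for source, label in samples:
        tokens = tokenize(source)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict(model, source):
    """Return the label with the highest log-posterior for an unseen macro."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    tokens = [t for t in tokenize(source) if t in model["vocab"]]  # skip unknown words
    best_label, best_score = None, -math.inf
    for label, counts in model["word_counts"].items():
        score = math.log(model["doc_counts"][label] / total_docs)
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        for token in tokens:
            score += math.log((counts[token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
unseen = ('Sub AutoOpen()\n Set s = CreateObject("WScript.Shell")\n'
          ' s.Run "cmd /c " & Chr(99)\nEnd Sub')
print(predict(model, unseen))
```

In this sketch, words such as CreateObject, Shell and Chr carry most of the evidence, which mirrors the abstract's observation that malicious macros tend to reuse typical download or execution functions and obfuscation helpers. A realistic evaluation would need a much larger labelled corpus and a train/test split by malware family.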

© 2019 by the Information Processing Society of Japan