Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
 
A Machine Learning Based Three-step Framework for Malicious URL Detection
Qisheng Chen, Kazumasa Omote
JOURNAL FREE ACCESS

2024 Volume 32 Pages 1105-1113

Abstract

Malicious URLs are a security problem that has long plagued the Internet. Traditionally, blacklists were used to distinguish malicious URLs from benign ones, but blacklists have shortcomings such as slow updates, and research on detecting malicious URLs with machine learning is increasing as a result. These studies propose their own methods and report high accuracy, but work that surveys and summarizes malicious URL detection is insufficient. In this paper, we propose a three-step framework for malicious URL detection, consisting of a Segmentation step, an Embedding step, and a Machine Learning step, which is useful for systematically summarizing different machine learning based malicious URL detection methods. We review 14 related works using our three-step framework and find that almost all research on machine learning based malicious URL detection can be classified by it. Using the framework, we also evaluate the suitability of several machine learning models and context-considering methods, that is, methods that consider the corpus's context during vector generation. The results confirm the importance of considering context: we find that context-considering embedding methods are more important, and that malicious URL detection accuracy improves when context-considering methods are used.
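
To make the three steps concrete, the following is a minimal Python sketch of how a URL might pass through Segmentation, Embedding, and Machine Learning. It is an illustration only, not the authors' implementation. The delimiter-based tokenizer, the token-count embedding (which is context-free, unlike the context-considering methods the abstract evaluates), the perceptron classifier, the function names, and the toy URLs on reserved example domains are all assumptions introduced here.

```python
import re
from collections import Counter


def segment(url):
    """Step 1 (Segmentation): split a URL into lowercase alphanumeric tokens.
    Splitting on non-alphanumeric characters is an assumed, simple scheme."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]


def build_vocab(urls):
    """Assign an index to every token seen in the training URLs."""
    vocab = {}
    for url in urls:
        for tok in segment(url):
            vocab.setdefault(tok, len(vocab))
    return vocab


def embed(url, vocab):
    """Step 2 (Embedding): token-count vector over the vocabulary.
    This is a context-free stand-in chosen only to keep the sketch short."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(segment(url)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def score(x, w, b):
    """Linear score of a feature vector under weights w and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_perceptron(X, y, epochs=20):
    """Step 3 (Machine Learning): a plain perceptron.
    Labels are 1 (malicious) or 0 (benign)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            err = label - (1 if score(x, w, b) > 0 else 0)
            if err:
                w = [wi + err * xi for wi, xi in zip(w, x)]
                b += err
    return w, b


def predict(url, vocab, w, b):
    """Run all three steps on one URL and return 1 (malicious) or 0 (benign)."""
    return 1 if score(embed(url, vocab), w, b) > 0 else 0


# Hypothetical toy data; the labels are made up for illustration.
TRAIN = [
    ("http://example.com/login/verify/account", 1),
    ("http://example.net/secure/update/password", 1),
    ("https://example.org/docs/index.html", 0),
    ("https://example.com/blog/news", 0),
]

vocab = build_vocab(url for url, _ in TRAIN)
X = [embed(url, vocab) for url, _ in TRAIN]
y = [label for _, label in TRAIN]
w, b = train_perceptron(X, y)

print(predict("http://example.net/login/update", vocab, w, b))
```

Because each step is a separate function, one component can be swapped without touching the others, for example replacing `embed` with a context-considering embedding while keeping the same segmentation and classifier. This separation is what lets the framework be used to classify and compare different detection methods.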

© 2024 by the Information Processing Society of Japan