Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
 
SHERRY: A Precise Approach based on Inter-set Similarity Algorithm for Vulnerable Code Clone Discovery
Reika Nishimura.Arakawa, Yo Kanemoto, Mitsuaki Akiyama
Journal, free access

2025, Volume 33, pp. 537-551

Abstract

The reuse of third-party code, such as open-source software (OSS), enhances software development efficiency but may introduce vulnerabilities that pose significant risks to systems. This paper focuses on known vulnerabilities originating from reused code, referred to as a “code clone” (CC), with the specific term vulnerable CC used to denote a vulnerable fragment. Previous studies detect only vulnerable CCs that match almost exactly or that fall within a limited scope of the inspected software. In this paper, we developed SHERRY, a precise approach to detecting vulnerable CCs. It enables the detection of vulnerable CCs that are not precisely matched by converting the function code into a fine-grained set of features consisting of line-by-line elements. For scalability, SHERRY reduces the number of comparisons and calculates similarity using logical operations. Furthermore, we analyzed 50 high-profile OSS projects, tracking vulnerable CCs detected by SHERRY and examining how developers manage them. In a comparison experiment with existing techniques using the same 10 OSS projects, SHERRY improved recall by over 10% and accelerated processing time 17-fold without limiting scope. Our measurements also revealed 87 vulnerable CCs in 22 OSS projects, and more than half of them were comparable to the most dangerous software weakness type. We find that there are three reasons why vulnerable CCs remain in OSS repositories. Finally, we conclude with practical suggestions to prevent the propagation of vulnerable CCs in the OSS ecosystem.
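As a rough illustration of the idea in the abstract (line-by-line feature sets compared with logical operations), the following Python sketch may help. It is not SHERRY's implementation: the normalization rule, the `FeatureIndex` helper, and the use of a containment ratio as the inter-set similarity are all assumptions made for this example, since the abstract does not specify them.

```python
import re


def line_features(function_source):
    """Turn a function body into a set of per-line features.

    Assumption: a feature is simply the line with all whitespace removed.
    """
    features = set()
    for line in function_source.splitlines():
        normalized = re.sub(r"\s+", "", line)
        if normalized:
            features.add(normalized)
    return features


class FeatureIndex:
    """Hypothetical helper: gives every distinct feature its own bit position."""

    def __init__(self):
        self._bit_of = {}

    def to_bitmask(self, features):
        mask = 0
        for feature in features:
            bit = self._bit_of.setdefault(feature, len(self._bit_of))
            mask |= 1 << bit
        return mask


def containment(vuln_mask, target_mask):
    """Share of the vulnerable function's features found in the target.

    The intersection is a single bitwise AND; counting set bits gives its size.
    """
    total = bin(vuln_mask).count("1")
    if total == 0:
        return 0.0
    shared = bin(vuln_mask & target_mask).count("1")
    return shared / total


# A known-vulnerable function and a slightly modified clone of it.
vulnerable = "int f(char *s) {\n  char buf[8];\n  strcpy(buf, s);\n  return 0;\n}"
candidate = (
    "int f(char *s) {\n  char buf[8];\n  log(s);\n"
    "  strcpy(buf, s);\n  return 1;\n}"
)

index = FeatureIndex()
vuln_mask = index.to_bitmask(line_features(vulnerable))
cand_mask = index.to_bitmask(line_features(candidate))
score = containment(vuln_mask, cand_mask)
```

Here the candidate is not an exact match (one line added, one changed), yet four of the five vulnerable-function features survive, so a threshold on `score` could still flag it. Encoding each set as an integer bitmask is one way to make the comparison a cheap logical operation rather than a per-element lookup.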

© 2025 by the Information Processing Society of Japan