SHERRY: A Precise Approach based on Inter-set Similarity Algorithm for Vulnerable Code Clone Discovery

Reika Nishimura.Arakawa; Yo Kanemoto; Mitsuaki Akiyama

doi:10.2197/ipsjjip.33.537

Abstract

The reuse of third-party code, such as open-source software (OSS), enhances software development efficiency but may introduce vulnerabilities that pose significant risks to systems. This paper focuses on known vulnerabilities originating from reused code, referred to as “code clones” (CCs); we use the term vulnerable CC to denote a vulnerable code fragment. Previous studies detect only vulnerable CCs that match almost exactly or that lie within a limited scope of the inspected software. In this paper, we developed SHERRY, a precise approach to detecting vulnerable CCs. It enables the detection of vulnerable CCs that are not precisely matched by converting the function code into a fine-grained set of features consisting of line-by-line elements. For scalability, SHERRY reduces the number of comparisons and calculates similarity using logical operations. Furthermore, we analyzed 50 high-profile OSS projects, tracking vulnerable CCs detected by SHERRY and examining how developers manage them. In a comparison experiment with existing techniques on the same 10 OSS projects, SHERRY improved recall by over 10% and sped up processing 17-fold without limiting scope. Our measurements also revealed 87 vulnerable CCs in 22 OSS projects, and more than half of them were comparable to the most dangerous software weakness type. We find three causes for why vulnerable CCs remain in OSS repositories. Finally, we conclude with practical suggestions to prevent the propagation of vulnerable CCs in the OSS ecosystem.
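
The general idea of comparing functions as sets of line-level features with logical operations can be illustrated with a minimal sketch. This is not SHERRY's implementation: the abstract does not specify the feature extraction or the similarity measure, so the whitespace normalization, the Jaccard index, and every function name below are assumptions chosen for illustration only.

```python
def line_features(function_code: str) -> set[str]:
    """Turn a function body into a set of whitespace-normalized, non-empty lines.

    Assumption: a "feature" is simply one normalized source line.
    """
    return {
        " ".join(line.split())
        for line in function_code.splitlines()
        if line.strip()
    }


def to_bitmask(features: set[str], vocabulary: dict[str, int]) -> int:
    """Encode a feature set as an integer bitmask.

    Each distinct feature gets one bit position, assigned on first sight.
    """
    mask = 0
    for feature in features:
        index = vocabulary.setdefault(feature, len(vocabulary))
        mask |= 1 << index
    return mask


def popcount(mask: int) -> int:
    """Count the set bits in a non-negative integer."""
    return bin(mask).count("1")


def jaccard(mask_a: int, mask_b: int) -> float:
    """Jaccard similarity computed with bitwise AND / OR (an assumed measure)."""
    union = popcount(mask_a | mask_b)
    if union == 0:
        return 0.0
    return popcount(mask_a & mask_b) / union


# Hypothetical example: a known vulnerable function and a modified clone.
vulnerable = "char buf[16];\nstrcpy(buf, src);\nreturn 0;"
candidate = "char buf[16];\n  strcpy(buf,  src);\nlog(src);\nreturn 0;"

vocabulary: dict[str, int] = {}
mask_v = to_bitmask(line_features(vulnerable), vocabulary)
mask_c = to_bitmask(line_features(candidate), vocabulary)
similarity = jaccard(mask_v, mask_c)  # 3 shared lines out of 4 distinct lines
```

Encoding sets as bitmasks lets intersection and union reduce to single AND and OR operations, which is one plausible way that logical operations could keep a set-based comparison cheap at scale.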
