2025 Volume 33 Pages 815-825
Identifying services from TLS-encrypted IP flows is useful for several purposes, such as providing zero-rating services. In TLS 1.2 and earlier, the SNI field is not encrypted and can be analyzed for service identification; accordingly, a method has been proposed that identifies the service of such flows based on SNI occurrences. The method first investigates the relationship between SNI occurrences and the services being accessed. It then identifies the service from IP flows based on the observed SNI occurrences using Bayesian inference. In this paper, we focus on this method and discuss how to improve its identification accuracy. We show that one of the main causes of reduced accuracy is that identification is given up when the observed SNI occurrence pattern is absent from the previously created database. To solve this problem, we propose excluding SNIs from the information used for identification according to entropy. The proposed method excludes SNIs in increasing order of the amount of relative entropy reduced by each SNI, until identification is no longer given up. We compare the accuracy of the existing and proposed methods and show that the proposed method improves identification accuracy.
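The idea of excluding SNIs until identification succeeds can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. Several things here are assumptions made only for illustration: the database is a list of hypothetical (SNI set, service) records, the posterior is the empirical frequency of services among matching records, and each SNI is scored by the relative entropy between P(service | SNI) and the prior P(service), which stands in for the entropy measure of the proposed method. The function names and the example domains are likewise hypothetical.

```python
import math
from collections import Counter

def kl_score(records, sni):
    """Assumed score: relative entropy D(P(service | sni) || P(service)) in bits."""
    prior = Counter(service for _, service in records)
    cond = Counter(service for snis, service in records if sni in snis)
    n_prior, n_cond = sum(prior.values()), sum(cond.values())
    if n_cond == 0:
        return 0.0  # SNI never seen in the database: treated as uninformative
    return sum((c / n_cond) * math.log2((c / n_cond) / (prior[s] / n_prior))
               for s, c in cond.items())

def posterior(records, observed, features):
    """Empirical P(service | pattern), comparing patterns only on `features`."""
    target = observed & features
    hits = Counter(service for snis, service in records
                   if snis & features == target)
    total = sum(hits.values())
    return {s: c / total for s, c in hits.items()} if total else None

def identify(records, observed, allow_exclusion=True):
    """Return (service, excluded SNIs); service is None if identification is given up."""
    features = set(observed).union(*(snis for snis, _ in records))
    # Least informative SNIs first; ties are broken by name for determinism.
    order = sorted(features, key=lambda s: (kl_score(records, s), s))
    excluded = []
    while True:
        post = posterior(records, observed, features)
        if post:
            return max(post, key=post.get), excluded
        if not allow_exclusion or not order:
            return None, excluded
        sni = order.pop(0)
        features.discard(sni)
        excluded.append(sni)

# Hypothetical database and observation, for illustration only.
records = [
    (frozenset({"video.example", "cdn.example"}), "video"),
    (frozenset({"video.example", "cdn.example"}), "video"),
    (frozenset({"chat.example", "cdn.example"}), "chat"),
]
observed = frozenset({"video.example", "ads.example"})
```

With `allow_exclusion=False` the sketch behaves like the existing method and gives up on `observed`, because no stored pattern matches it exactly. With exclusion enabled, the two SNIs that score zero under the assumed measure are dropped first, after which the remaining SNI matches the stored "video" patterns.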