2016 Volume E99.B Issue 7 Pages 1523-1531
A parallel Aho-Corasick (AC) approach, named PAC-k, is proposed for string matching in deep packet inspection (DPI). The proposed approach adopts graphic processing units (GPUs) to perform the string matching in parallel for high throughput. In parallel string matching, the boundary detection problem happens when a pattern is matched across chunks. The PAC-k approach solves the boundary detection problem because the number of characters to be scanned by a thread can reach the longest pattern length. An input string is divided into multiple sub-chunks with k characters. By adopting the new starting position in each sub-chunk for the failure transition, the required number of threads is reduced by a factor of k. Therefore, the overhead of terminating and reassigning threads is also decreased. In order to avoid the unnecessary overlapped scanning with multiple threads, a checking procedure is proposed that decides whether a new starting position is in the sub-chunk. In the experiments with target patterns from Snort and realistic input strings from DEFCON, throughputs are enhanced greatly compared to those of previous AC-based string matching approaches.