2021 Volume 29 Pages 174-187
A common problem with recursive descent parsing is that parsing with a left-recursive grammar does not terminate: the same parsing function is invoked recursively without ever consuming any input. Packrat parsing, a variant of recursive descent parsing that handles parsing expression grammars (PEGs) by backtracking, is affected by the same problem. Although naive backtracking parsers may take exponential time, packrat parsers achieve linear time complexity (for grammars that are not left-recursive) by memoizing the result of each call to a parsing function. Several methods have been proposed to solve the left recursion problem in packrat parsers. These methods modify the memoization table to limit the depth of recursive calls; by calling the same parsing function repeatedly while increasing the limit, they gradually expand the parsed range of the input string. However, these methods cannot correctly handle multiple occurrences of left-recursive calls at the same input position, and they fail on some grammars that do not include left recursion. In this research, we propose and implement a new packrat parser that addresses these problems. The parser handles multiple occurrences of left-recursive calls at the same input position by giving priority to the most recently used rule when gradually expanding the parsed range of the recursion. In our evaluation, we confirmed that the proposed method supports not only the left-recursive grammars handled by existing methods but also grammars that those methods cannot handle.
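The general idea behind the existing approaches, re-invoking a left-recursive parsing function against a memoized partial result until the parsed range stops growing, can be illustrated with a minimal sketch. This is not the method proposed in this paper, and it does not address multiple left-recursive calls at the same input position. The grammar `expr <- expr '-' num / num`, the class `Parser`, and all method names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

# A parse result is (value, end position); None means failure.
Result = Optional[Tuple[int, int]]


class Parser:
    """Illustrative packrat-style parser for: expr <- expr '-' num / num"""

    def __init__(self, text: str) -> None:
        self.text = text
        # Memoization table: (rule name, start position) -> result.
        self.memo: Dict[Tuple[str, int], Result] = {}

    def num(self, pos: int) -> Result:
        # num <- [0-9]+  (not memoized here, to keep the sketch short)
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        if end == pos:
            return None
        return int(self.text[pos:end]), end

    def expr_body(self, pos: int) -> Result:
        # First alternative: expr '-' num (the left-recursive one).
        left = self.expr(pos)
        if left is not None:
            value, p = left
            if p < len(self.text) and self.text[p] == "-":
                right = self.num(p + 1)
                if right is not None:
                    return value - right[0], right[1]
        # Second alternative: num.
        return self.num(pos)

    def expr(self, pos: int) -> Result:
        key = ("expr", pos)
        if key in self.memo:
            return self.memo[key]
        # Plant a failing entry so the nested left-recursive call returns
        # immediately instead of recursing forever.
        self.memo[key] = None
        best: Result = None
        while True:
            result = self.expr_body(pos)
            # Stop once a further round no longer extends the parsed range.
            if result is None or (best is not None and result[1] <= best[1]):
                break
            best = result
            self.memo[key] = best  # the next round builds on this result
        return best


print(Parser("10-3-2").expr(0))  # (5, 6): parsed as (10 - 3) - 2
```

Each round of the loop lets the left-recursive alternative consume one more `'-' num` suffix, so the result is left-associative. The loop ends when a round fails to consume more input than the previous one.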