Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Derivatives of Regular Expressions with Lookahead
Takayuki MiyazakiYasuhiko Minamide
ジャーナル フリー

2019 年 27 巻 p. 422-430


Lookahead is an extension of regular expressions that has been adopted in many implementations and is widely used. Lookahead represents what is allowed as the rest of input. Morihata developed a conversion from regular expressions with lookahead (REwLA) to deterministic finite automata by extending Thompson's construction. In this paper, we develop a conversion from REwLA to deterministic finite automata by extending derivatives of regular expressions. First, we formalize the semantics of REwLA. An REwLA has information about the rest of the input, so the definition of the semantics of REwLA is not languages but structures different from those of regular expressions. Thus, we introduce languages with lookahead as sets of pairs of strings with several operations and define the semantics of REwLA as languages with lookahead. Next, we define two kinds of left quotient for languages with lookahead and give corresponding derivatives. Then, we show that the types of expressions obtained by repeatedly applying derivatives are finite under some equivalence relation and give a conversion to deterministic finite automata. We also show that the semantics of REwLA is a finite union of sets of the form A × B, where A and B are regular languages.

© 2019 by the Information Processing Society of Japan
前の記事 次の記事