2015 Volume E98.A Issue 6 Pages 1197-1211
A term tree pattern is a rooted ordered tree pattern which consists of ordered tree structures with edge labels and structured variables with labels. All variables with the same label in a term tree pattern can be simultaneously replaced with ordered trees isomorphic to the same rooted ordered tree. Then, a term tree pattern is suitable for representing structural features common to tree structured data such as XML documents on the web, the secondary structures of RNA in biology and parse trees describing grammatical structures of natural languages. Let $\ott$ be the set of all term tree patterns which have one or more variables with the same label. Let $\lott$ be the set of all term tree patterns t such that all variables in t have distinct labels. We remark that $\lott\subsetneq \ott$ holds. In this paper, we consider a problem, called Matching problem for term tree patterns, of deciding whether or not a given rooted ordered tree T is obtained from a given term tree pattern t by replacing variables in t with rooted ordered trees. We show that Matching problem for term tree patterns in $\ott$ is NP-complete, by giving a reduction from the string pattern matching problem, which is NP-complete. Next, by giving operations on an interval, which is a set containing all integers between two given integers representing vertex identifiers, we propose an efficient algorithm for solving Matching problem for term tree patterns in $\lott\subsetneq \ott$. Then, we show that, when an ordered tree having N vertices and a term tree pattern $t \in \lott$ having n vertices are given, the proposed matching algorithm correctly solves this problem in O(nN) time.