Semi-structured data is now widely used in many fields, and structured data mining has become correspondingly more important. Mining frequent patterns is one of the most fundamental tasks on structured data, but frequent pattern miners often discover a huge number of patterns. Two major approaches have been proposed to overcome this problem: condensed representation mining and constraint-based mining. In this paper, we integrate the two approaches by proposing three algorithms, RCLOCOT, posCLOCOT, and negCLOCOT, that discover closed ordered subtrees under anti-monotone constraints on the structure of the patterns. The proposed algorithms discover closed constrained subtrees efficiently, not by post-processing, but by pruning and skipping parts of the search space based on occurrence matching and the patterns on the border.
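To make the role of an anti-monotone structural constraint concrete, the following is a minimal sketch, not the RCLOCOT, posCLOCOT, or negCLOCOT algorithms themselves. It assumes ordered tree patterns encoded as preorder sequences of (depth, label) pairs, grown by rightmost extension, and a hypothetical constraint that bounds the number of nodes and the depth. Support counting and the closedness check are deliberately omitted; the only point shown is that once a pattern violates such a constraint, every extension of it can be pruned.

```python
from typing import Iterator, List, Sequence, Tuple

Node = Tuple[int, str]       # (depth, label); the root has depth 0
Pattern = Tuple[Node, ...]   # nodes of an ordered tree listed in preorder


def satisfies(pattern: Pattern, max_nodes: int, max_depth: int) -> bool:
    """Hypothetical anti-monotone constraint: bounded size and bounded depth.

    Growing a pattern never reduces its size or depth, so a violating
    pattern cannot have a satisfying extension.
    """
    return len(pattern) <= max_nodes and max(d for d, _ in pattern) <= max_depth


def rightmost_extensions(pattern: Pattern, labels: Sequence[str]) -> Iterator[Pattern]:
    """Grow the pattern by one node attached to a node on its rightmost path."""
    last_depth = pattern[-1][0]
    for depth in range(1, last_depth + 2):
        for label in labels:
            yield pattern + ((depth, label),)


def enumerate_patterns(labels: Sequence[str], max_nodes: int, max_depth: int) -> List[Pattern]:
    """Enumerate every ordered tree pattern that satisfies the constraint."""
    found: List[Pattern] = []
    stack: List[Pattern] = [((0, label),) for label in labels]
    while stack:
        pattern = stack.pop()
        if not satisfies(pattern, max_nodes, max_depth):
            continue  # prune: no extension of this pattern can satisfy the constraint
        found.append(pattern)
        stack.extend(rightmost_extensions(pattern, labels))
    return found


if __name__ == "__main__":
    # With one label and at most 4 nodes there are 1 + 1 + 2 + 5 ordered trees.
    print(len(enumerate_patterns(["a"], max_nodes=4, max_depth=3)))  # 9
```

A real miner would additionally count occurrences of each pattern in the data and keep only the closed ones; the sketch isolates the constraint-based pruning step.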
In this paper, a logic program is regarded as the union of a set of rules and an axiom that defines the basic predicates and functions. A procedure for computing logic programs is regarded as the combination of a replacement procedure that uses the bodies of the rules and a proof procedure with respect to the axiom. We prove that a goal is a logical consequence of Δ ∪ Γ if and only if there exists n such that the formula obtained by replacing the atomic formulae in the goal n times is a logical consequence of Δ, where Γ is the set of rules and Δ is the axiom. We also describe conditions on Γ and Δ.
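The main result can be written compactly in symbols. The names G for the goal and G_n for the formula obtained after n replacement steps are notation assumed here for illustration, not taken from the paper:

```latex
\Delta \cup \Gamma \models G
\quad\Longleftrightarrow\quad
\exists n :\ \Delta \models G_n
```

Read from right to left, the statement says that replacement with the rule bodies followed by a proof from the axiom alone never derives anything that is not a consequence of the whole program; read from left to right, it says that every consequence can be reached after finitely many replacement steps.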
In this paper, we propose a class of algorithms for detecting change-points in time-series data based on subspace identification, which is originally a geometric approach for estimating the linear state-space models that generate time-series data. Our algorithms are derived from the principle that the subspace spanned by the columns of an observability matrix and the subspace spanned by the subsequences of time-series data are approximately equivalent. We first derive a batch-type algorithm applicable to ordinary time-series data, i.e., data consisting only of an output series, and then introduce an online version of the algorithm and an extension that handles input-output time-series data. Comparative experiments on artificial and real datasets illustrate the superior performance of our algorithms.
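The idea of comparing subspaces spanned by subsequences can be illustrated with a small sketch. This is a simplified subspace-comparison heuristic written for illustration, not the subspace-identification algorithm proposed in the paper: it estimates a low-rank subspace from past subsequences with an SVD and measures how much of the energy of the following subsequences falls outside it. The function names, the window sizes `width` and `cols`, the `rank`, and the test signal are all assumptions.

```python
import numpy as np


def subsequence_matrix(x, start, width, cols):
    """Stack `cols` consecutive length-`width` subsequences of x as columns."""
    return np.column_stack([x[start + j:start + j + width] for j in range(cols)])


def change_score(x, t, width=10, cols=20, rank=2):
    """Fraction of the energy after time t lying outside the subspace before t.

    Illustrative heuristic only; all parameters are assumed values.
    """
    span = width + cols - 1  # number of samples covered by one matrix
    if t - span < 0 or t + span > len(x):
        raise ValueError("t is too close to the ends of the series")
    past = subsequence_matrix(x, t - span, width, cols)   # samples t-span .. t-1
    future = subsequence_matrix(x, t, width, cols)        # samples t .. t+span-1
    # Orthonormal basis of the dominant `rank`-dimensional column space of `past`.
    u, _, _ = np.linalg.svd(past, full_matrices=False)
    basis = u[:, :rank]
    # Part of `future` that cannot be explained by the past subspace.
    residual = future - basis @ (basis.T @ future)
    return float(np.linalg.norm(residual) ** 2 / np.linalg.norm(future) ** 2)


def demo_signal(n=400, change_at=200):
    """Noise-free sinusoid whose frequency changes at `change_at` (assumed test data)."""
    k = np.arange(n)
    return np.where(k < change_at, np.sin(0.2 * k), np.sin(0.8 * k))


if __name__ == "__main__":
    x = demo_signal()
    print("score away from the change:", change_score(x, 100))
    print("score at the change:       ", change_score(x, 200))
```

On this noise-free signal the score is close to zero where the dynamics are unchanged, because subsequences of a single sinusoid lie in a two-dimensional subspace, and it is clearly larger at the point where the frequency changes. With noisy data the rank and window sizes would need to be chosen with more care.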