A Partial-tree-based Approach for XPath Query on Large XML Trees

Wei Hao; Kiminori Matsuzaki

doi:10.2197/ipsjjip.24.425

Abstract

XML is a popular data definition language and is widely used to represent arbitrary data structures. XPath is commonly used to query XML documents in many applications. The cost of evaluating a query grows with the number of nodes in the document, and querying a very large XML document becomes difficult when there is not enough memory to store and manipulate the whole tree. The objective of this study is to develop an algorithm for querying very large XML trees in a distributed-memory environment. We split a large XML document into small chunks and parse each chunk into a special tree called a partial tree. The query is then executed in parallel on the partial trees, and the results from the partial trees are concatenated to form the final query result. The algorithms were tested on a 16-node PC cluster, and the experimental results showed a speedup of a factor of 6 on 16 nodes.
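
To illustrate why a chunk of an XML document yields only a partial tree, the following minimal Python sketch scans one chunk and reports which tags are left unmatched inside it. It is not the authors' algorithm: the function name `open_nodes`, the labels `left_open` and `right_open`, and the simplified tag pattern are assumptions made for this illustration, and the sketch assumes that chunk boundaries never fall inside a tag.

```python
import re

# Simplified tag pattern (an assumption of this sketch): it captures an
# optional leading "/", the tag name, and an optional trailing "/".
# Comments, CDATA sections, and processing instructions are not matched.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^<>]*?(/?)>")


def open_nodes(chunk):
    """Return (left_open, right_open) tag names for one chunk.

    left_open:  end tags whose start tag must lie in an earlier chunk.
    right_open: start tags whose end tag must lie in a later chunk.
    """
    left_open = []
    right_open = []
    for closing, name, self_closing in TAG.findall(chunk):
        if self_closing:
            continue  # <x/> is complete within the chunk
        if not closing:
            right_open.append(name)  # start tag, not yet closed
        elif right_open and right_open[-1] == name:
            right_open.pop()  # end tag closes a start tag in this chunk
        else:
            left_open.append(name)  # end tag with no start tag in this chunk
    return left_open, right_open


# Hypothetical document split at tag boundaries into three chunks.
chunks = ["<a><b><c/>", "</b><b><d>", "</d></b></a>"]
for chunk in chunks:
    print(chunk, open_nodes(chunk))
```

Only the first and last chunks contain the `a` element's tags, so no single chunk holds a complete tree; the unmatched tags are the information that has to be reconciled across chunks before per-chunk query results can be combined.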
