Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
Making High-level Queries on Diverse Genome Data: A Structured Genome Document Database System Based on GXML and GQL
Aaron Stokes, Hideo Matsuda, Akihiro Hashimoto
JOURNAL FREE ACCESS

1999 Volume 10 Pages 176-185

Abstract

Complete DNA sequences (genomes) and associated data are being made available worldwide at an astonishing rate. Through computer analysis of such data, molecular biologists hope to gain an overall understanding of the genome, such as by predicting large-scale gene networks. However, this is difficult because diverse genome data are scattered across many highly heterogeneous databases, and because existing database systems lack the facilities to expose and analyze functional relationships among the data. To address these problems, we propose a new type of genome database system. Since a genome can be thought of intuitively as a kind of ‘document’, our system uses a structured document language based on XML to effectively represent genomes and associated data. The information-rich structures of the genome documents help cope with data diversity and heterogeneity. A powerful query language is introduced that exposes important biological relationships among the genome data. We have obtained favorable results from several experiments, demonstrating the usefulness of our method in building a top-down view of genome functionality.

© Japanese Society for Bioinformatics