An Infrastructure for Comparative Genomics to Functionally Characterize Genes and Proteins

Clemens Suter-Crazzolara; Gunther Kurapkat

doi:10.11234/gi1990.11.24

Abstract

Current genome projects are producing a flood of sequence data. The interpretation of these sequences is lagging behind, and optimized data analysis strategies need to be developed. Much can be learned from comparing different genomes, as the genomes of even distantly related organisms may still encode proteins with high sequence similarity. The order of genes (colinearity) in genomes may also be conserved to some extent.
We have used both of these observations to create a multi-functional computational analysis system (genomeSCOUT™), which allows rapid identification and functional characterization of genes and proteins through genome comparison. Using a number of independent algorithms, information about different levels of protein homology (e.g. paralogs, orthologs, and clusters of orthologous groups, COGs) and about gene order is collected and stored in several value-added databases. These databases are then used for interactive comparison of genomes and subsequent analysis. The application is based on the well-established data integration system SRS. This ensures (1) fast handling of large genomic data sets, (2) straightforward access to a multitude of biological databases, (3) unique linking functions between these databases, (4) highly efficient collection of information on genes and proteins, and (5) fully integrated and user-friendly graphical representations of search results.
This application can be used for projects as diverse as the correct annotation of genomes, the optimization of (micro)organisms for industrial production, or the identification of drug targets [22].
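
The two observations the abstract builds on, cross-genome protein similarity and conserved gene order, can be illustrated with a small sketch. The abstract does not describe genomeSCOUT's algorithms, so this is not the authors' method. It assumes a common stand-in technique (reciprocal best hits) for proposing ortholog pairs, followed by a simple adjacency check for conserved gene order. The function names, the score dictionaries, and the gene identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical input type: (query gene, subject gene) -> similarity score,
# e.g. taken from an all-against-all protein comparison.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query gene to its highest-scoring subject gene."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> Dict[str, str]:
    """Propose ortholog pairs: a's best hit is b AND b's best hit is a."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return {a: b for a, b in a_best.items() if b_best.get(b) == a}


def conserved_neighbours(
    order_a: List[str], order_b: List[str], orthologs: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Return adjacent gene pairs in genome A whose proposed orthologs
    are also adjacent in genome B (in either orientation)."""
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    pairs = []
    for left, right in zip(order_a, order_a[1:]):
        b_left, b_right = orthologs.get(left), orthologs.get(right)
        if b_left in pos_b and b_right in pos_b:
            if abs(pos_b[b_left] - pos_b[b_right]) == 1:
                pairs.append((left, right))
    return pairs
```

Given made-up scores and gene orders, `reciprocal_best_hits` yields candidate ortholog pairs and `conserved_neighbours` reports which neighbouring genes in one genome remain neighbours in the other. A real system would also need score thresholds and handling of paralogs and larger rearrangements.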
