Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
Volume 8
Displaying 1-50 of 93 articles from this issue
  • Vlado Dancik, Michael S. Waterman
    1997Volume 8 Pages 1-8
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Recently a new method for obtaining restriction maps was developed by David Schwartz at NYU. With this method, restriction maps are created from fluorescent images of individual molecules obtained using a microscope. For every observed molecule, image processing methods are used to generate a list of the approximate locations of the sites where the molecule is cut by the restriction enzyme. Our task is to find the locations of all restriction sites given the observed cut-sites. The task is further complicated by the fact that the orientation of the molecules is unknown, i.e. for a cut-site x we do not know whether x or 1-x corresponds to a restriction site in a unit-length molecule.
    First we consider the case in which the orientation of all molecules and the number c of restriction sites are known. We suppose that for each restriction site location $\theta_j$ the corresponding measured cut-sites follow the normal distribution with density function $g(x; \theta_j, \sigma_j)$ for some $\sigma_j$. (This means the measurement is unbiased, with mean $\theta_j$.) The observed cut-site locations $x_1, \ldots, x_n$ then follow the mixture distribution $f(x; p, \theta, \sigma) = \sum_{j=1}^{k} p_j \, g(x; \theta_j, \sigma_j)$, where $\sum_{j=1}^{k} p_j = 1$. Using the likelihood principle, we wish to find parameters $p, \theta, \sigma$ that maximize the likelihood function $\prod_{i=1}^{n} f(x_i; p, \theta, \sigma)$. In our case it is natural to assume that $p_1 = \cdots = p_k = 1/k$ and $\sigma_1 = \cdots = \sigma_k = \sigma$ for a constant $\sigma$.
    In Optical Mapping there frequently appear "false" cuts, i.e. cuts corresponding to no restriction site. In our model we accommodate false cuts by using a uniform component in the mixture distribution. We use the EM algorithm and Bayes' theorem to compute the maximum likelihood estimate and compare our results for the different variants of our model.
    We explore how changing the orientation of some molecules influences the maximum likelihood estimate and show that in our case the orientation question can be answered for each molecule separately. Finally we present a few ideas for specifying the orientation of molecules without investigating the positions of restriction sites.
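    The mixture model in this abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes known orientations, cut-sites on the unit interval, k equally weighted normal components with a shared sigma, and a uniform component of weight q for false cuts. The function names, the starting values, and the simulated data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sigma):
    """Density of the normal distribution N(mean, sigma^2) at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def simulate_cuts(sites, sigma, per_site, false_cuts, seed=0):
    """Hypothetical test data: noisy cuts around each site plus uniform false cuts."""
    rng = random.Random(seed)
    cuts = [rng.gauss(s, sigma) for s in sites for _ in range(per_site)]
    cuts += [rng.random() for _ in range(false_cuts)]
    return cuts


def em_cut_sites(cuts, k, iterations=200):
    """Estimate k site locations, a shared sigma, and the false-cut weight q."""
    xs = sorted(cuts)
    n = len(xs)
    # Starting values are assumptions: means at evenly spaced quantiles.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    sigma, q = 0.05, 0.1
    for _ in range(iterations):
        # E-step: posterior probability (Bayes' theorem) that each cut
        # came from each normal component or from the uniform component.
        resp = []
        for x in xs:
            dens = [(1.0 - q) / k * normal_pdf(x, m, sigma) for m in means]
            dens.append(q)  # the uniform density on [0, 1] equals 1
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted means, pooled variance, new q.
        weights = [sum(r[j] for r in resp) for j in range(k)]
        means = [
            sum(r[j] * x for r, x in zip(resp, xs)) / weights[j]
            if weights[j] > 1e-12 else means[j]
            for j in range(k)
        ]
        squares = sum(
            r[j] * (x - means[j]) ** 2
            for r, x in zip(resp, xs)
            for j in range(k)
        )
        sigma = max(math.sqrt(squares / max(sum(weights), 1e-12)), 1e-4)
        q = min(max(sum(r[k] for r in resp) / n, 1e-6), 1.0 - 1e-6)
    return means, sigma, q


if __name__ == "__main__":
    demo = simulate_cuts([0.2, 0.5, 0.8], 0.02, 100, 30)
    print(em_cut_sites(demo, k=3))
```

    The sketch leaves out the orientation problem entirely; handling it would require deciding, for each molecule, between x and 1-x.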
    Download PDF (919K)
  • Piotr P. Slonimski, M.O. Mosse, P. Golik, A. Henaut, J.L. Risler, J.P. ...
    1997Volume 8 Pages 9-10
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (203K)
  • Jérôme Chailloux, S. A. GENSET
    1997Volume 8 Pages 11
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (15K)
  • Antoine Danchin, Alain Hénaut
    1997Volume 8 Pages 13-14
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (177K)
  • Kiyoshi Asai, Yutaka Ueno, Katunobu Itou, Tetsushi Yada
    1997Volume 8 Pages 15-24
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    In this paper, we propose a new approach to gene recognition that uses no training data for the recognizer. We start from a simple model that uses only knowledge of the start and stop codons, and then repeat two steps: recognition of the DNA sequences by the recognizer, and training of the recognizer's parameters on the result of that recognition. We applied this parse-and-train approach to the complete genome sequence of a cyanobacterium and achieved almost the same recognition rate as when the whole sequence is used as training data. These results open the possibility of using automatic gene annotation systems in the early stages of sequencing projects.
    Download PDF (1008K)
  • Mathieu Blanchette, Guillaume Bourque, David Sankoff
    1997Volume 8 Pages 25-34
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We describe a number of heuristics for inferring the gene orders of the hypothetical ancestral genomes in a fixed phylogeny. The optimization criterion is the minimum number of breakpoints (pairs of genes adjacent in one genome but not the other) in the gene orders of two genomes connected by an edge of the tree, summed over all edges. The key to the method is an exact solution for trees with three leaves (the median problem) based on a reduction to the Traveling Salesman Problem.
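    The breakpoint criterion can be sketched as follows. This is an illustration under simplifying assumptions (unsigned, linear gene orders over the same gene set), not the authors' code; the function names and the small tree in the usage note are hypothetical.

```python
def adjacencies(order):
    """Unordered pairs of genes that are neighbours in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def breakpoints(order_a, order_b):
    """Count adjacencies of order_a that do not occur in order_b."""
    if set(order_a) != set(order_b):
        raise ValueError("both genomes must contain the same genes")
    return len(adjacencies(order_a) - adjacencies(order_b))


def tree_score(orders, edges):
    """Sum the breakpoint counts over all edges (u, v) of a fixed tree."""
    return sum(breakpoints(orders[u], orders[v]) for u, v in edges)
```

    For a three-leaf tree with a candidate median `m`, `tree_score(orders, [("m", "a"), ("m", "b"), ("m", "c")])` gives the quantity that the median problem minimizes over all choices of `orders["m"]`.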
    Download PDF (796K)
  • Jacek Blazewicz, Piotr Formanowicz, Marta Kasprzak, Wojciech T. Markie ...
    1997Volume 8 Pages 35-42
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    The paper is concerned with the computational phase of sequencing DNA chains by hybridization. It is assumed that positive faults can occur in the hybridization experiment. An approach based on a reduction of the problem to a variant of the Selective Traveling Salesman Problem, together with an algorithm for solving the latter, is proposed. The algorithm behaves extremely well, even for a fault rate exceeding 50%.
    Download PDF (956K)
  • Koichiro Doi, Hiroshi Imai
    1997Volume 8 Pages 43-52
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Selecting a good collection of primers is very important for polymerase chain reaction (PCR) experiments. Most existing algorithms for primer selection are concerned with computing a primer pair for each DNA sequence. In generalizing arbitrarily primed PCR and related methods to the case in which all DNA sequences of the target objects are already known, such as the roughly 6000 ORFs of yeast, we may design a small set of primers so that all the targets are PCR amplified and resolved electrophoretically in a series of experiments. This is quite useful because decreasing the number of primers greatly reduces the cost of experiments. Pearson et al. [7, 8] consider finding a minimum set of primers covering all given DNA sequences, but their method does not meet necessary biological conditions such as primer amplification and electrophoresis resolution.
    In this paper, based on the modeling and computational complexity analysis by Doi [2], we propose algorithms for this primer selection problem. These algorithms do not necessarily minimize the number of primers, but this is inevitable, since basic versions of these problems are shown to be computationally intractable, even to approximate when the length resolution condition is imposed. The algorithms incorporate the amplification condition for a primer pair and the length resolution condition for electrophoresis. They are based on the theoretically well-founded greedy algorithm for set cover from computer science. Preliminary computational results are presented to show the validity of this approach. The number of computed primers is much less than half the number of targets, and hence less than one fourth of the number needed in multiplex PCR.
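    The greedy set-cover idea underlying these algorithms can be sketched as below. The sketch covers targets with single primers only and ignores the amplification and length resolution conditions that the paper incorporates; the function name and the input mapping are hypothetical.

```python
def greedy_primer_cover(targets_by_primer):
    """Greedy set cover: repeatedly pick the primer covering the most uncovered targets.

    targets_by_primer maps a primer name to the set of targets it covers.
    Ties are broken by primer name so the result is deterministic.
    """
    uncovered = set().union(*targets_by_primer.values())
    chosen = []
    while uncovered:
        best = max(
            sorted(targets_by_primer),
            key=lambda primer: len(targets_by_primer[primer] & uncovered),
        )
        chosen.append(best)
        uncovered -= targets_by_primer[best]
    return chosen
```

    Each iteration removes at least one target from `uncovered`, so the loop terminates; the result is a cover, though not necessarily a minimum one.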
    Download PDF (1149K)
  • Yukiko Fujiwara, Minoru Asogawa, Kenta Nakai
    1997Volume 8 Pages 53-60
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    The mitochondrial targeting signal (MTS) is the presequence that directs nascent proteins bearing it to mitochondria. We have developed a hidden Markov model (HMM) that represents various known sequence characteristics of MTSs, such as the length variation, amino acid composition, amphiphilicity, and consensus pattern around the cleavage site. The topology and parameters of this model are automatically determined by the iterative duplication method, in which a small fully connected HMM is gradually expanded by state splitting. The model can be used to predict the existence of MTSs for given amino acid sequences. Its prediction accuracy was estimated to be 86.9% using the cross-validation test. Furthermore, for various synthetic peptides, a higher correlation was observed between the HMM score and the in vitro ATPase activity of MSF, which can be regarded as an experimental measure of signal strength, than was observed with other methods.
    Download PDF (893K)
  • Hideki Hirakawa, Satoru Kuhara
    1997Volume 8 Pages 61-70
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Information concerning the secondary structures, flexibility, epitopes and hydrophobic regions of amino acid sequences can be extracted by assigning physicochemical indices to each amino acid residue, and information on structure can be derived using the sliding window averaging technique, which is widely used for smoothing raw functions. Wavelet analysis has shown great potential and applicability in many fields, such as astronomy, radar, earthquake prediction, and signal or image processing. This approach is efficient for removing noise from various functions. Here we employed wavelet analysis to smooth a plot of a hydrophobicity index along amino acid sequences. We then used the resulting function to predict hydrophobic cores in globular proteins. We calculated the prediction accuracy for the hydrophobic cores of a representative set of 88 proteins. Wavelet analysis predicted hydrophobic cores with 6.13% greater accuracy than the sliding window averaging technique.
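    The sliding window averaging baseline mentioned in this abstract can be sketched as follows. The abstract does not say which hydrophobicity index was used, so the Kyte-Doolittle scale here is an assumption, as are the function names and the default window width; the wavelet smoothing itself is not shown.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of index).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_average(values, window):
    """Average each position over a centred window, truncated at the ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def hydropathy_profile(sequence, window=7):
    """Smoothed hydropathy plot for a protein sequence in one-letter code."""
    raw = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return sliding_window_average(raw, window)
```

    Stretches where the smoothed profile stays high are candidate hydrophobic regions; the threshold for calling a core is left open here.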
    Download PDF (1217K)
  • Hiroaki Inayoshi, Hitoshi Iba
    1997Volume 8 Pages 71-79
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper presents a new computational method for the modeling and simulation of gene expression by introducing the artificial chemical system. The artificial chemical system is specified by four items: (1) components (five kinds of particles and DNA with Genetic Switches); (2) space (2-dimensional polar grids); (3) simple reaction rules (construction and destruction of molecules, etc.); (4) simple behavioral rules (stochastic movements and stochastic collisions, etc.). The simulation demonstrates the capability of the system to exhibit emergent behavior: that is, global order of the system (regular rhythms, i.e. regular oscillations in the amounts of some gene products, in this case) emerges out of the randomness (through stochastic movements and collisions) of the components.
    Download PDF (673K)
  • Jeffrey M. Koshi, David P. Mindell, Richard A. Goldstein
    1997Volume 8 Pages 80-89
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We describe a model for characterizing site mutations in evolving proteins. By representing the fitness of each of the amino acids as a function of the physical-chemical properties of that amino acid, and constructing mutation matrices based on Boltzmann statistics and Metropolis kinetics, we are able to greatly reduce the number of adjustable parameters. This allows us to include site heterogeneity in the model, as well as to optimize the model for specific protein types. We demonstrate the applicability of the model by investigating the phylogenetic relationship between various subtypes of HIV-1.
    Download PDF (1291K)
  • Antje Krause, Martin Vingron
    1997Volume 8 Pages 90-99
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    An iterative database searching method is introduced and applied to the design of a database clustering procedure. The search method virtually never produces false positive hits while determining meaningfully large sets of sequences related to the query. A novel set-theoretic database clustering algorithm exploits this feature and avoids a traditional, distance-based clustering step. This makes it fast and applicable to data-sets of the size of, e. g., the Swiss-Prot database. In practice we achieve unambiguous assignment of 80% of Swiss-Prot sequences to non-overlapping sequence clusters in an entirely automatic fashion.
    Download PDF (1342K)
  • Nobutaka Mitsuhashi, Haretsugu Hishigaki, Toshihisa Takagi
    1997Volume 8 Pages 100-109
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Knowledge discovery in large databases (KDD) is being performed in several application domains, for example, the analysis of sales data, and is expected to be applied to other domains. We propose a KDD approach to multipoint linkage analysis, which is a way of ordering loci on a chromosome. Strict multipoint linkage analysis based on maximum likelihood estimation is a computationally tough problem. So far various kinds of approximate methods have been implemented. Our method, based on the discovery of associations between genetic recombinations, is so different from the others that it is useful for rechecking their results. In this paper, we describe how to apply the framework of association rule discovery to linkage analysis, and also discuss how filtering the input data and interpreting the discovered rules after data mining are practically as important as the data mining process itself.
    Download PDF (1061K)
  • Pedro Romero, Zoran Obradovic, A.Keith Dunker
    1997Volume 8 Pages 110-124
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Our recently reported results [14, 29, 30] provide strong support for a hypothesis that some amino acid sequences code for disordered regions rather than structured ones and that such disordered regions are commonly involved in function. General and family-specific neural network predictors developed in those previous studies suggest that different classes of disordered regions exist. Here, family-specific data preprocessing for disorder prediction in the calcineurin (CaN) family is explored. The results show that prediction of order and disorder on CaN sequence data benefits significantly from the use of family-specific preprocessing, with feature extraction through principal components analysis (PCA) outperforming feature selection techniques, although all methods do a good job of discriminating CaN-specific disordered regions from CaN-specific ordered regions. On the other hand, for the discrimination of CaN-specific disordered regions from general (unrelated to CaN) ordered regions, feature selection approaches proved to be more appropriate than PCA. The results further support a hypothesis that different kinds of disordered regions exist, as all family-specific disorder predictors developed in this study significantly outperformed a previously reported general multi-family disorder predictor.
    Download PDF (1915K)
  • Sivasundaram Suharnan, Takeshi Itoh, Hideo Matsuda, Hirotada Mori
    1997Volume 8 Pages 125-134
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Motivation: Several methods in genetic information have recently been developed to estimate classification of protein sequences through their sequence similarity. These methods are essential for understanding the function of predicted open reading frames (ORFs) and their molecular evolutionary processes. However, since many protein sequences consist of a number of independently evolved structural units (we refer to these units as components), the combinatorial nature of the components makes it difficult to classify the sequences.
    Results: This paper presents a new method for classifying uncharacterized protein sequences. As the measure of sequence similarity, we use a similarity score computed by a method based on the Smith-Waterman local alignment algorithm. We describe how this method copes with sequences that have a multi-component structure. The method was applied to predicted ORFs on the Escherichia coli genome, and we discuss the algorithm and experimental results.
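    The Smith-Waterman local alignment score on which the similarity measure is based can be sketched as below. The scoring parameters (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration; a protein comparison would normally use a substitution matrix, and the paper's own scoring scheme is not given in this abstract.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of sequences a and b with a linear gap penalty."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cell = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best
```

    Because the score is local, two multi-component sequences that share only one component can still score highly, which is the situation the abstract highlights.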
    Download PDF (1043K)
  • Katsutoshi Takahashi, Masayuki Nakazawa, Yasuo Watanabe
    1997Volume 8 Pages 135-146
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have developed a powerful image processing system, DNAinsight, which performs automated detection of the several thousand spots found on autoradiogram images obtained with 2-D gel electrophoresis of genomic DNA. Algorithms and parameters for detecting spot locations and intensities are carefully chosen so as to enable reliable and rapid processing of 2-D gel electrophoretograms based on the RLGS (restriction landmark genomic scanning) method. In DNAinsight, matching of several related spot patterns, such as those from tumor and normal cells, can be accomplished rapidly with easy operations, the matching being solved by comparing the Delaunay net and relative neighborhood graph. The automated and accurate image processing system strongly supports the rapid identification and analysis of genetic variation in the DNA of humans and other animals.
    Download PDF (2824K)
  • Masaru Tomita, Tom Shimizu, Kanako Saito, J. Craig Venter, Kenta Hashi ...
    1997Volume 8 Pages 147-155
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We present E-CELL, a generic computer software environment for modeling a cell and conducting experiments in silico. The E-CELL system allows a user to define functions of proteins, protein-protein interactions, protein-DNA interactions, regulation of gene expression and other features of cellular metabolism, in terms of a set of reaction rules. The system then executes those reactions iteratively, and the user can observe, through a computer display, dynamic changes in concentrations of proteins, protein complexes and other chemical compounds in the cell.
    Using this software, we constructed a model of a hypothetical cell with only 127 genes sufficient for transcription, translation, energy production and phospholipid synthesis. Most of the genes are taken from Mycoplasma genitalium, the organism having the smallest known chromosome, whose complete 580kb genome sequence was determined at TIGR in 1995.
    We discuss future applications of the E-CELL system with special respect to genome engineering.
    Download PDF (1313K)
  • Laurence Vignal, Frédérique Lisacek
    1997Volume 8 Pages 156-165
    Published: 1997
    Released on J-STAGE: November 16, 2011
    JOURNAL FREE ACCESS
    Given the problem of identifying exons in new genomic DNA, the sketch of a resolution process was drawn using sequence data and models of site/signal recognition. A multi-agent architecture is used to validate these models and test hypotheses on the chronology of events involved in gene splicing. Information is channelled through a hierarchy of agents. Each type of agent is the result of a successful step in the resolution process. The system does not rely on the compositional bias of coding sequences which is a key feature of current computer methods.
    Download PDF (1244K)
  • Frédéric Achard, Emmanuel Barillot, Gis Infobiogen
    1997Volume 8 Pages 166-172
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper focuses on a specific type of information frequently used by researchers in Genetics: links between genome objects. It emphasizes the fact that, at present, links are not sufficiently characterized and describes our work to address this problem: the design of a prototype databank to store links between genome databases. Because this global repository is of concern for many people, we welcome and encourage feedback from the community.
    Download PDF (848K)
  • Tatsuya Akutsu
    1997Volume 8 Pages 173-179
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper describes simple DP (dynamic programming) algorithms for RNA secondary structure prediction with pseudoknots, for which no explicit DP algorithm had been known. Results of preliminary computational experiments are described too.
    Download PDF (695K)
  • Tatsuya Akutsu
    1997Volume 8 Pages 180-186
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper shows that the problem of finding a protein side-chain packing is computationally hard (NP-hard), where the problem is defined here as a combinatorial search problem using a rotamer library. Although this result does not suggest a new method, it gives a justification for previous methods using such heuristics as simulated annealing, neural networks, genetic algorithms, and Gibbs sampling.
    Download PDF (761K)
  • Winston Hide, John Burke, Alan Christoffels, Robert Miller
    1997Volume 8 Pages 187-196
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    In order to provide a novel maximised approach to the generation of accurate, comprehensive, consensus sequences of the expressed human genome, we have developed and produced a system for a novel-representation, broad gene coverage, consensus database of expressed human gene fragments (ESTs). To perform clustering of ESTs, we have developed and employed D2-cluster, an algorithm based on the d2-search algorithm (Hide et al. 1994) specifically for EST clustering. D2-cluster does not require alignment in order to perform clustering (Burke, Davison and Hide, in prep). We have incorporated d2-cluster into a portable and novel system to perform clustering, alignment and automated error analysis of publicly available expressed sequence tags (STACKIPACK). The system includes a statistically robust algorithm that can detect and compensate for error within an aligned cluster of ESTs. We have manufactured a database of partial human consensus sequences from 552 013 ESTs from dbEST 040896 and TIGR. The database is termed Sequence Tag Alignment and Consensus Knowledgebase (STACK). STACK 1.0 contains 18 divisions based on tissue annotation, identifying 204 431 unique sequences and generating 76 131 consensi which represent 321 134 ESTs. The consensus sequences have an average length of 497 bases, a 39% increase over the 357 base average length of the input data set. Clone IDs are used to join 92 759 unique sequences and 48 858 consensi into 61 632 linked sequences, averaging 900 bases each. The distribution of clusters compares favourably with UniGene, reflecting the difference in clustering methodology and the higher number of sequences input into STACK. A high-accuracy database, SANIGENE, is also generated, consisting of sequences which agree in at least two ESTs. STACK is a distributable, core information resource upon which a comprehensive knowledgebase can be built.
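    Alignment-free comparison of the kind this abstract refers to can be illustrated with a simplified word-count distance. This is a sketch in the spirit of d2, not the D2-cluster algorithm: it compares whole sequences rather than windows, and the function names and the default word length are assumptions.

```python
from collections import Counter


def word_counts(sequence, word_len):
    """Count every overlapping word of length word_len in the sequence."""
    return Counter(
        sequence[i:i + word_len] for i in range(len(sequence) - word_len + 1)
    )


def d2_distance(a, b, word_len=6):
    """Sum of squared differences between the word counts of two sequences."""
    counts_a = word_counts(a, word_len)
    counts_b = word_counts(b, word_len)
    return sum(
        (counts_a[word] - counts_b[word]) ** 2
        for word in set(counts_a) | set(counts_b)
    )
```

    Two ESTs could then be placed in the same cluster when their distance falls below a chosen threshold; the threshold and the clustering rule are left unspecified here.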
    Download PDF (1146K)
  • Makoto Hirosawa, Katsumi Isono
    1997Volume 8 Pages 197-206
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Previously, we developed a GeneMark-based procedure, termed GeneMark-RC, applied it to the identification and classification of ORFs in genomic sequence data, and identified and characterized ORFs in the 1.0 Mb data of the cyanobacterium Synechocystis sp. strain PCC 6803. In the present study, we have improved the procedure and analyzed the whole genomic data of Synechocystis. Consequently, we noticed the presence of three distinct classes of ORFs in this organism. The prediction of ORFs by the class-specific GeneMark-RC analysis agreed with 97.9% of those described for this bacterium. Moreover, 124 additional ORFs were identified. The procedure was similarly applied to the genomic analysis of five other prokaryotes, and 2 to 3 classes of ORFs were recognized in each case. Common features were found among the ORFs identified in the six organisms including Synechocystis. Class 1 is composed of the most typical ORFs, whose GC content is slightly higher than the average, while Class 2 is composed of ORFs with GC contents lower than the average. It was found that ORFs of one species can be detected with the GeneMark-RC parameters obtained from other organisms, and the prediction rate is high when the difference in their GC contents is small. It was also found that ORFs of three species with relatively low GC contents can be detected well with the Synechocystis matrices of Class 2 ORFs, whose GC content is similar to that of the three species. Therefore, although there are two to three classes of ORFs in each species, their di-codon statistics must be rather similar to each other if their GC contents are similar. A notable exception was the case of Methanococcus jannaschii, which might reflect the fact that it is an archaebacterium.
    Download PDF (1346K)
  • Andreas M. Kogelnik, Shamkant B. Navathe, Douglas C. Wallace
    1997Volume 8 Pages 207-214
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have developed the Georgia Tech Emory Networked Object Management Environment (GENOME). GENOME is a prototype database management system (DBMS)/user interface system designed to manage complex biological data, allowing users to more fully analyze and understand relationships in human genome data. The system is designed to allow the establishment of a network of searchable data sources. The DBMS portion of the environment is a hybrid object-relational system which interprets its data structures on-the-fly, resulting in an extremely flexible DBMS. Such a DBMS provides an environment for interrelating distributed data items, allowing users to further explore computational questions in biomedical science in addition to other fields by maximizing access to data. In developing GENOME, we used MITOMAP, a human mitochondrial genome database, as a model genomic database. MITOMAP encompasses one of the most complete collections of genomic data available for a specific locus or chromosome, including functional, population variation, disease mutation, and gene-gene interaction data, as well as complete sequence data for the human mitochondrial chromosome, and thus serves as an excellent model system. An effective DBMS is required for handling the plethora of Human Genome Project data to handle the various locus-specific databases and ultimately to unify all human genetic and biomedical information through the complete human genome sequence. Developing such a DBMS is our goal. We expect that GENOME will be generally applicable to other biological and non biological paradigms as well.
    Download PDF (1249K)
  • Éric Rivals, Jean-Paul Delahaye, Max Dauchet, Olivier Delgrange
    1997Volume 8 Pages 215-226
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Long direct repeats in genomes arise from molecular duplication mechanisms such as retrotransposition, gene copying, and exon shuffling. Their study in a given sequence reveals its internal repeat structure as well as part of its evolutionary history. Moreover, detailed knowledge about the mechanisms can be gained from a systematic investigation of repeats. The problem of finding such repeats is viewed as an NP-complete problem of optimally compressing a sequence by encoding its exact repeats. The repeats chosen for compression must not overlap each other, just as the repeats that result from molecular duplications do not. We present a new heuristic algorithm, Search_Repeats, in which the selection of exact repeats is guided by two biologically sound criteria: their length and the absence of overlap between those repeats. Search_Repeats detects approximate repeats, as clusters of exact sub-repeats, and points out large insertions/deletions in them. Search_Repeats takes only 3 seconds of CPU time for the genome of Haemophilus influenzae on a Sun Ultrasparc workstation.
    Download PDF (1416K)
  • Kiyotaka Shiba, Hiromi Motegi, Tetsuo Noda
    1997Volume 8 Pages 227-233
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Aminoacyl-tRNA synthetases (ARSs) are believed to have arisen early in the evolution of life as the essential components that establish the link between triplet codons and amino acids. We have cloned and sequenced eight cDNAs for human cytoplasmic ARSs. Along with twelve sequences that have been reported from other laboratories, a set of 20 human cytoplasmic ARS genes is now available. We compared these human ARSs with ~400 ARS sequences currently available from various organisms and deduced the possible evolutionary history of these enzymes. The availability of complete sets of ARSs from thirteen organisms (H. sapiens, S. cerevisiae, E. coli, H. influenzae, H. pylori, N. gonorrhoeae, S. pyogenes, M. genitalium, M. pneumoniae, Synechocystis sp., M. jannaschii, M. thermoautotrophicum, and A. fulgidus) made systematic analyses of the evolution of this gene family possible. In this paper, we focus on two topics: (1) the acquisition of new structural domains by the core enzyme domains in higher eukaryotes and their possible role in the formation of multi-synthetase supra-molecular complexes, and (2) the existence of eukaryotic-like ARSs in some bacterial genomes, and the relationship of this occurrence to tRNA recognition.
    Download PDF (667K)
  • Jianghong An, Yasushi Kubota, Takao Nakama, Akinori Sarai
    1997Volume 8 Pages 234-235
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have created an integrated database, search and visualization tool, named 3DinSight, to help researchers gain insight into the relationship between the structure, function and properties of biomolecules. Various kinds of searches can be carried out through WWW interfaces. The locations of motif sequences and mutations are automatically mapped onto the structure and visualized in 3D space by interactive viewers, VRML (Virtual Reality Modeling Language) and RasMol. In the case of VRML, the mapped 3D objects are hyper-linked to the corresponding document data. The amino acid properties of structural, functional and mutation sites can be displayed as graph plots. 3DinSight is freely accessible through the URL http://www.rtc.riken.go.jp/3DinSight.html.
    Download PDF (286K)
  • Genome Information Broker and the Enhancement of SAKURA
    Kousuke Goto, Toshitsugu Okayama, Hirotada Mori, Hikaru Yamamoto, Tomo ...
    1997Volume 8 Pages 236-237
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Data submitters, reviewers and users of the DNA Data Bank of Japan (DDBJ) now process sequence data longer than 1M base pairs, thanks to genome projects. In order to realize smooth and reliable submission, annotation and dissemination of this large-scale genetic information, DDBJ developed systems which visualize sequences and relevant biological information. A newly developed data dissemination system named Genome Information Broker and the enhancement of the Web data submission system SAKURA are introduced here from the viewpoint of visualization.
    Download PDF (218K)
  • Yasuhiko Kitamura, Tetsuya Nozaki, Shoji Tatsumi, Akira Tanigami
    1997Volume 8 Pages 238-239
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    MetaCommander is developed as a generic tool to retrieve and integrate information from WWW servers by interpreting a script. By using MetaCommander, we can support genome information processing on WWW browsers in various ways.
    Download PDF (210K)
  • Toward Software Agent for Genome Information Analysis
    Hiroshi Matsuno, Manabu Hori, Nobuaki Wada, Miyako Tanaka
    1997Volume 8 Pages 240-241
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (179K)
  • Hiroshi Matsuno, Misako Ichimura, Tatsumi Fukuyama, Miyako Tanaka
    1997Volume 8 Pages 242-243
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (214K)
  • S. Minoshima, S. Mitsuyama, S. Ohno, T. Kawamura, N. Shimizu
    1997Volume 8 Pages 244-245
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (559K)
  • Tadashi Mizunuma, Sadahiko Misu, Motonori Ota, Ken Nishikawa
    1997Volume 8 Pages 246-247
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (776K)
  • Kotoko Nakata, Takako Igarashi, Tsuguchika Kaminuma
    1997Volume 8 Pages 248-249
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    A database of cell membrane receptors has been developed. The system can collect data items such as attributes of proteins from distributed data sources on the Internet. Such sources include internationally standard biological databases such as the updated genetic database of PIR, Swiss Prot, PDB, GenBank, EMBL and GDB. The system provides various viewing tools that effectively display different types of receptor data: DNA sequences, amino acid sequences, DNA binding sites, ligand binding sites, gene and disease information, and protein structural information. It can also display three-dimensional images using the freeware program RASMOL. DNA binding sites, ligand binding sites and active sites are distinguished by coloring the sequences. PDB matching sites are distinguished by italicization. CSNDB (Cell Signaling Networks Database), a database of human cellular signal transduction, is also linked in the system. The database may be useful as a quick reference for ligand-membrane receptors and signal transduction in drug design.
    Download PDF (355K)
  • Tadasu Shin-i, Yuji Kohara
    1997Volume 8 Pages 250-251
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have developed a WWW-based database, named NEXTDB, to integrate all the information on ESTs (tag sequences of cDNA clones) and gene expression patterns of C. elegans that is being produced and analyzed in this laboratory. NEXTDB incorporates and processes raw tag-sequencing data and classifies them into unique cDNA groups by comparing the 3'-tags. The database contains information on the map positions of the cDNA groups, their correspondence to predicted CDSs and their homologies to genes of other organisms. NEXTDB also incorporates in situ hybridization image data that show the expression patterns of individual cDNA groups, and it provides a platform for annotating the images. The database also contains the cosmid contig maps obtained from AceDB. All of the information is interlinked in NEXTDB, which can be accessed through the Internet.
    Download PDF (297K)
  • DNAinsight
    Takayuki Toda, Katsutoshi Takahashi, Masayuki Nakazawa, Yasuo Watanabe
    1997Volume 8 Pages 252-253
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (320K)
  • Tatsuya Akutsu, Satoru Kuhara, Osamu Maruyama, Satoru Miyano
    1997Volume 8 Pages 254-255
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (228K)
  • Tatsuya Akutsu, Akira Ohyama, Kyotetsu Kanaya, Asao Fujiyama
    1997Volume 8 Pages 256-257
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have been developing an image analysis system named DDGEL for 2D gel electrophoresis of genomic DNA. Recently, we have developed a program module for finding a correspondence of spots between two gel electrophoresis images.
    Download PDF (561K)
  • Tatsuya Akutsu, Hiroshi Tashimo
    1997Volume 8 Pages 258-259
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have been developing a novel method of deriving a score function for protein threading. In this method, the constraint that the score of the native threading is the minimum over all possible threadings is expressed in the form of linear inequalities, and the parameters defining the score function are then determined by solving these inequalities. The proposed method was evaluated using Lathrop and Smith's algorithm for finding optimal threadings and was shown to be effective for computing nearly correct threadings.
    Download PDF (230K)
  • Hidemasa Bono, Susumu Goto, Hiroyuki Ogata, Minoru Kanehisa
    1997Volume 8 Pages 260-261
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Predicting gene functions from the whole genome sequence is an important problem in the postgenome era. We are developing a system that predicts functions from the whole genome sequence, using a functionally well-annotated genome as a reference organism for knowledge of biologically well-known pathways. The databases of gene catalogs and pathways are compiled under the KEGG project. In this paper we show an example of identifying the functions of genes involved in the two-component signal transduction system.
    Download PDF (237K)
  • Hiroki Fukasawa, Shigehiko Kanaya, Yoshihiro Kudo
    1997Volume 8 Pages 262-263
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (203K)
  • Takamasa Futatsuki, Yuichi Kawanishi, Kimitoshi Naito, Satoru Miyazaki ...
    1997Volume 8 Pages 264-265
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We analyzed the Smith and Waterman algorithm and the maximum likelihood method and implemented them on the Fujitsu VPP500 vector-parallel computer. The programs optimized for the computer are ssearch, clustalw and fastDNAml. Our goal is to develop a complete system that covers all processes from database search to the construction of large-scale phylogenetic trees on a supercomputer.
    Download PDF (161K)
  • Osamu Gotoh
    1997Volume 8 Pages 266-267
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (210K)
  • Qian-Ping Gu, Kazuyuki Iwata, Shietung Peng, Qi-Ming Chen
    1997Volume 8 Pages 268-269
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (204K)
  • Shugo Hamahashi, Hiroaki Kitano
    1997Volume 8 Pages 270-271
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Embryogenesis is one of the most important and mysterious processes in animal development. It is complex and hard to understand because it involves many elements, such as cells and nuclei, that interact with each other. We replicated the system of early segmentation in Drosophila on a computer. Computer simulation enables us to understand the whole system of animal development. The work reported here, which is part of the Virtual Drosophila project, is an attempt to observe in detail the mechanism of segmentation during the early development of Drosophila by computer simulation.
    Download PDF (225K)
  • Yoshitomo Harada, Masato Wayama, Toshio Shimizu
    1997Volume 8 Pages 272-273
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (196K)
  • Junichi Isamikawa, Katsutoshi Takahashi, Masayuki Nakazawa, Yasuo Wata ...
    1997Volume 8 Pages 274-275
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (629K)
  • Masahiro Hattori, Minoru Kanehisa
    1997Volume 8 Pages 276-277
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    To analyze apoptotic molecular interactions, we have developed a knowledge base on apoptosis that consists of molecular interactions concerning apoptosis reported in experimental papers. We have collected about 80 entries; each entry corresponds to one molecule and contains its interaction information.
    Download PDF (194K)
  • Mika Hirakawa, Kensaku Imai, Hiroko Yamaguchi, Junko Shimada, Kazuo Ta ...
    1997Volume 8 Pages 278-279
    Published: 1997
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    The goal of the Advanced Life Science Information Systems (ALIS) project is the construction of an entire human genome database that will provide an efficient source of information for researchers after the human genome has been sequenced. We initiated this project to encourage large-scale human genome sequencing and to develop systems for genome data management and data publishing on the World Wide Web. Two years have passed since the project began; our first attempt at human genome sequencing is going well, and more than 4M bases of well-edited human genome sequence have been acquired. The human genome project is progressing, and an international consensus on releasing data generated by the project has been established. We have improved our sequencing database to adapt to this situation. Recently we organized a collection and publication system for the genome sequencing data.
    Download PDF (164K)