-
Vlado Dancik, Michael S. Waterman
1997Volume 8 Pages
1-8
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Recently a new method for obtaining restriction maps was developed by David Schwartz at NYU. Using this method, restriction maps are created from fluorescent images of individual molecules obtained using a microscope. For every observed molecule, image processing methods are used to generate a list of the approximate locations of the sites where the molecule is cut by the restriction enzyme. Our task is to find the locations of all restriction sites given the observed cut-sites. The task is complicated by the fact that the orientation of the molecules is unknown, i.e. for a cut-site x we do not know whether x or 1-x corresponds to a restriction site in a unit-length molecule.
First we consider the case that the orientation of all molecules and the number $k$ of restriction sites are known. We suppose that for each restriction site location $\theta_j$ the corresponding measured cut-sites follow the normal distribution with the density function $g(x; \theta_j, \sigma_j)$ for some $\sigma_j$. (This means the measurement is unbiased with mean $\theta_j$.) The observed cut-site locations $x_1, \ldots, x_n$ then follow the mixture distribution $f(x; p, \theta, \sigma) = \sum_{j=1}^{k} p_j \, g(x; \theta_j, \sigma_j)$, where $\sum_{j=1}^{k} p_j = 1$. Using the likelihood principle we wish to find parameters $p, \theta, \sigma$ that achieve the maximum of the likelihood function $\prod_{i=1}^{n} f(x_i; p, \theta, \sigma)$. In our case it is natural to assume that $p_1 = \cdots = p_k = 1/k$ and $\sigma_1 = \cdots = \sigma_k = \sigma$ for a constant $\sigma$.
In Optical Mapping, “false” cuts frequently appear, i.e. cuts corresponding to no restriction site. In our model we accommodate false cuts by using a uniform component in the mixture distribution. We use the EM algorithm and Bayes' theorem to compute the maximum likelihood estimate and compare our results for the different variants of our model.
We explore how changing the orientation of some molecules influences the maximum likelihood estimate and show that, in our case, the orientation question can be answered for each molecule separately. Finally, we present a few ideas for specifying the orientation of molecules without investigating the positions of restriction sites.
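The estimation step described in this abstract can be illustrated with a minimal sketch. It assumes known orientation, cut positions in the unit interval, equal weights for the site components, one shared sigma, and a fixed weight for the uniform "false cut" component. The function names, the starting values and the fixed `false_cut_weight` are illustrative assumptions, not details taken from the paper.

```python
import math
import random


def normal_pdf(x, mean, sigma):
    """Density of the normal distribution N(mean, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def em_cut_sites(cuts, k, false_cut_weight=0.1, iterations=200):
    """Estimate k site locations and a shared sigma from cut positions in [0, 1].

    Assumptions (not from the paper): the uniform component has a fixed weight,
    the k normal components share the remaining weight equally, and the means
    start at evenly spaced quantiles of the data.
    """
    xs = sorted(cuts)
    n = len(xs)
    theta = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    sigma = 0.05
    site_weight = (1.0 - false_cut_weight) / k
    for _ in range(iterations):
        # E-step: posterior probability (Bayes' theorem) that each cut came from each site.
        resp = []
        for x in xs:
            dens = [site_weight * normal_pdf(x, t, sigma) for t in theta]
            total = sum(dens) + false_cut_weight  # the uniform density on [0, 1] equals 1
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted means, then one pooled standard deviation.
        mass = [sum(r[j] for r in resp) for j in range(k)]
        theta = [sum(r[j] * x for r, x in zip(resp, xs)) / max(mass[j], 1e-12)
                 for j in range(k)]
        sq = sum(r[j] * (x - theta[j]) ** 2 for r, x in zip(resp, xs) for j in range(k))
        sigma = max(math.sqrt(sq / max(sum(mass), 1e-12)), 1e-4)
    return sorted(theta), sigma


# Synthetic data: two true sites at 0.3 and 0.7 plus a few uniformly placed false cuts.
random.seed(0)
demo_cuts = ([random.gauss(0.3, 0.02) for _ in range(200)]
             + [random.gauss(0.7, 0.02) for _ in range(200)]
             + [random.random() for _ in range(20)])
theta_hat, sigma_hat = em_cut_sites(demo_cuts, k=2)
print(theta_hat, sigma_hat)
```

The sketch leaves out the unknown-orientation case entirely; handling it would require treating x and 1-x as alternative observations for each molecule.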
-
Piotr P. Slonimski, M.O. Mosse, P. Golik, A. Henaut, J.L. Risler, J.P. ...
1997Volume 8 Pages
9-10
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Jérôme Chailloux, S. A. GENSET
1997Volume 8 Pages
11
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Antoine Danchin, Alain Hénaut
1997Volume 8 Pages
13-14
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Kiyoshi Asai, Yutaka Ueno, Katunobu Itou, Tetsushi Yada
1997Volume 8 Pages
15-24
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
In this paper, we propose a new approach for gene recognition that uses no training data for the recognizer. We start from a simple model that uses only knowledge of the start and stop codons, and then alternate two steps: the recognizer parses the DNA sequences, and the parameters of the recognizer are retrained on the result of that recognition. We applied this parse-and-train approach to the complete genome sequence of a cyanobacterium and achieved almost the same recognition rate as when the whole sequence was used as training data. These results open the possibility of using an automatic gene annotation system in the early stage of sequencing projects.
-
Mathieu Blanchette, Guillaume Bourque, David Sankoff
1997Volume 8 Pages
25-34
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We describe a number of heuristics for inferring the gene orders of the hypothetical ancestral genomes in a fixed phylogeny. The optimization criterion is the minimum number of breakpoints (pairs of genes adjacent in one genome but not the other) in the gene orders of two genomes connected by an edge of the tree, summed over all edges. The key to the method is an exact solution for trees with three leaves (the median problem) based on a reduction to the Traveling Salesman Problem.
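The optimization criterion can be made concrete with a small sketch. It assumes unsigned, linear gene orders over the same gene set, which is a simplification chosen here for illustration; the function names and the dictionary layout for the tree are hypothetical.

```python
def adjacencies(order):
    """Set of unordered neighbouring gene pairs in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def breakpoints(order_a, order_b):
    """Number of gene pairs adjacent in order_a but not adjacent in order_b."""
    return len(adjacencies(order_a) - adjacencies(order_b))


def tree_score(edges, orders):
    """Sum of breakpoints over all edges (u, v) of a fixed tree.

    `orders` maps each node name, leaf or hypothetical ancestor, to its gene order.
    """
    return sum(breakpoints(orders[u], orders[v]) for u, v in edges)


# Three leaves joined to one candidate ancestor "M" (a median candidate).
orders = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, 4, 3, 5],
    "C": [1, 3, 2, 4, 5],
    "M": [1, 2, 3, 4, 5],
}
edges = [("M", "A"), ("M", "B"), ("M", "C")]
print(tree_score(edges, orders))
```

A heuristic of the kind the abstract describes would search over candidate orders for the internal nodes to lower this score; the search itself is not sketched here.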
-
Jacek Blazewicz, Piotr Formanowicz, Marta Kasprzak, Wojciech T. Markie ...
1997Volume 8 Pages
35-42
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
The paper is concerned with the computational phase of sequencing DNA chains by hybridization. It is assumed that positive faults can occur in the hybridization experiment. We propose an approach based on a reduction of the problem to a variant of the Selective Traveling Salesman Problem, together with an algorithm for solving the latter. The algorithm behaves extremely well, even for a fault rate exceeding 50%.
-
Koichiro Doi, Hiroshi Imai
1997Volume 8 Pages
43-52
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Selecting a good collection of primers is very important for polymerase chain reaction (PCR) experiments. Most existing algorithms for primer selection are concerned with computing a primer pair for each DNA sequence. In generalizing the arbitrarily primed PCR, etc., to the case that all DNA sequences of target objects are already known, such as the roughly 6000 ORFs of yeast, we may design a small set of primers so that all the targets are PCR amplified and resolved electrophoretically in a series of experiments. This is quite useful because decreasing the number of primers greatly reduces the cost of experiments. Pearson et al. [7, 8] consider finding a minimum set of primers covering all given DNA sequences, but their method does not meet necessary biological conditions such as primer amplification and electrophoresis resolution.
In this paper, based on the modeling and computational complexity analysis by Doi [2], we propose algorithms for this primer selection problem. These algorithms do not necessarily minimize the number of primers; this is inevitable, since basic versions of these problems are shown to be computationally intractable and, with the length resolution condition, hard even to approximate. The algorithms incorporate the amplification condition for a primer pair and the length resolution condition imposed by electrophoresis. They are based on the theoretically well-founded greedy algorithm for set cover from computer science. Preliminary computational results are presented to show the validity of this approach. The number of computed primers is much less than half the number of targets, and hence less than one fourth of the number needed in multiplex PCR.
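The greedy set cover step that these algorithms build on can be sketched as follows. This is the textbook greedy rule only: it ignores the primer-pair amplification condition and the length resolution condition that the paper incorporates, and the primer names and coverage sets are made up for illustration.

```python
def greedy_primer_cover(covers):
    """Pick primers greedily until every target is covered.

    `covers` maps a primer name to the set of target ids it can amplify
    (a hypothetical input format). Ties go to the primer listed first.
    """
    uncovered = set().union(*covers.values())
    chosen = []
    while uncovered:
        # Choose the primer that covers the most still-uncovered targets.
        best = max(covers, key=lambda primer: len(covers[primer] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break
        chosen.append(best)
        uncovered -= gained
    return chosen


example = {
    "p1": {1, 2, 3},
    "p2": {3, 4},
    "p3": {4, 5, 6},
    "p4": {1},
}
print(greedy_primer_cover(example))
```

The greedy rule gives no guarantee of a minimum cover, which matches the abstract's remark that the number of primers is not necessarily minimized.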
-
Yukiko Fujiwara, Minoru Asogawa, Kenta Nakai
1997Volume 8 Pages
53-60
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
The mitochondrial targeting signal (MTS) is the presequence that directs nascent proteins bearing it to mitochondria. We have developed a hidden Markov model (HMM) that represents various known sequence characteristics of MTSs, such as the length variation, amino acid composition, amphiphilicity, and consensus pattern around the cleavage site. The topology and parameters of this model are automatically determined by the iterative duplication method, in which a small fully connected HMM is gradually expanded by state splitting. The model can be used to predict the existence of MTSs for given amino acid sequences. Its prediction accuracy was estimated to be 86.9% using a cross-validation test. Furthermore, for various synthetic peptides, the HMM score correlated more strongly than the scores of other methods with the in vitro ATPase activity of MSF, which can be regarded as an experimental measure of signal strength.
-
Hideki Hirakawa, Satoru Kuhara
1997Volume 8 Pages
61-70
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Information concerning the secondary structures, flexibility, epitope and hydrophobic regions of amino acid sequences can be extracted by assigning physicochemical indices to each amino acid residue, and information on structure can be derived using the sliding window averaging technique, which is in wide use for smoothing out raw functions. Wavelet analysis has shown great potential and applicability in many fields, such as astronomy, radar, earthquake prediction, and signal or image processing. This approach is efficient for removing noise from various functions. Here we employed wavelet analysis to smooth out a plot of a hydrophobicity index assigned along amino acid sequences. We then used the resulting function to predict hydrophobic cores in globular proteins. We calculated the prediction accuracy for the hydrophobic cores of a representative set of 88 proteins. With wavelet analysis, hydrophobic cores were predicted with 6.13% greater accuracy than with the sliding window averaging technique.
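The baseline that the wavelet method is compared against, sliding window averaging of a per-residue index, can be sketched as below. The abstract does not say which hydrophobicity index was used; the Kyte-Doolittle hydropathy scale appears here only as an assumed stand-in, and the window handling at the sequence ends is an illustrative choice.

```python
# Kyte-Doolittle hydropathy values, used here as an assumed example index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_average(values, window):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def hydropathy_profile(sequence, window=7):
    """Smoothed hydropathy plot for a protein sequence in one-letter code."""
    raw = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return sliding_average(raw, window)


print(hydropathy_profile("MKTLLILAVVAAALA", window=5))
```

Stretches where the smoothed profile stays high would be candidate hydrophobic regions; the thresholding and the wavelet-based smoothing evaluated in the paper are not reproduced here.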
-
Hiroaki Inayoshi, Hitoshi Iba
1997Volume 8 Pages
71-79
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper presents a new computational method for the modeling and simulation of gene expression by introducing the artificial chemical system. The artificial chemical system is specified by four items: (1) components (five kinds of particles and DNA with Genetic Switches); (2) space (2-dimensional polar grids); (3) simple reaction rules (construction and destruction of molecules, etc.); (4) simple behavioral rules (stochastic movements and stochastic collisions, etc.). The simulation demonstrates the capability of the system to exhibit emergent behavior: that is, global order of the system (regular rhythms, i.e. regular oscillations in the amounts of some gene products, in this case) emerges out of the randomness (through stochastic movements and collisions) of the components.
-
Jeffrey M. Koshi, David P. Mindell, Richard A. Goldstein
1997Volume 8 Pages
80-89
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We describe a model for characterizing site mutations in evolving proteins. By representing the fitness of each of the amino acids as a function of the physical-chemical properties of that amino acid, and constructing mutation matrices based on Boltzmann statistics and Metropolis kinetics, we are able to greatly reduce the number of adjustable parameters. This allows us to include site heterogeneity in the model, as well as to optimize the model for specific protein types. We demonstrate the applicability of the model by investigating the phylogenetic relationship between various subtypes of HIV-1.
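One way to read "mutation matrices based on Boltzmann statistics and Metropolis kinetics" is sketched below. It assumes fitness values already expressed in units of kT and a uniform proposal among the other states; both are assumptions made for illustration, and the paper's mapping from physical-chemical properties to fitness is not reproduced.

```python
import math


def metropolis_matrix(fitness):
    """Row-stochastic matrix M[i][j]: probability that state i is replaced by j in one step."""
    n = len(fitness)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Propose j uniformly among the other states; always accept a fitter
                # state, accept a less fit one with probability exp(fitness difference).
                diff = fitness[j] - fitness[i]
                accept = 1.0 if diff >= 0 else math.exp(diff)
                matrix[i][j] = accept / (n - 1)
        matrix[i][i] = 1.0 - sum(matrix[i])
    return matrix


def boltzmann_weights(fitness):
    """Stationary distribution proportional to exp(fitness)."""
    weights = [math.exp(f) for f in fitness]
    total = sum(weights)
    return [w / total for w in weights]


def detailed_balance_gap(fitness):
    """Largest violation of pi_i * M[i][j] == pi_j * M[j][i]; should be near zero."""
    m = metropolis_matrix(fitness)
    pi = boltzmann_weights(fitness)
    n = len(fitness)
    return max(abs(pi[i] * m[i][j] - pi[j] * m[j][i]) for i in range(n) for j in range(n))


print(detailed_balance_gap([0.0, 1.0, -0.5]))
```

Because every entry follows from the fitness vector, the matrix has only as many free parameters as the fitness function, which illustrates how such a construction can reduce the number of adjustable parameters.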
-
Antje Krause, Martin Vingron
1997Volume 8 Pages
90-99
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
An iterative database searching method is introduced and applied to the design of a database clustering procedure. The search method virtually never produces false positive hits while determining meaningfully large sets of sequences related to the query. A novel set-theoretic database clustering algorithm exploits this feature and avoids a traditional, distance-based clustering step. This makes it fast and applicable to data sets of the size of, e.g., the Swiss-Prot database. In practice we achieve unambiguous assignment of 80% of Swiss-Prot sequences to non-overlapping sequence clusters in an entirely automatic fashion.
-
Nobutaka Mitsuhashi, Haretsugu Hishigaki, Toshihisa Takagi
1997Volume 8 Pages
100-109
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Knowledge discovery in large databases (KDD) is being performed in several application domains, for example, the analysis of sales data, and is expected to be applied to other domains. We propose a KDD approach to multipoint linkage analysis, which is a way of ordering loci on a chromosome. Strict multipoint linkage analysis based on maximum likelihood estimation is a computationally tough problem. So far various kinds of approximate methods have been implemented. Our method, based on the discovery of associations between genetic recombinations, is so different from the others that it is useful for rechecking their results. In this paper, we describe how to apply the framework of association rule discovery to linkage analysis, and also discuss why filtering the input data and interpreting the discovered rules after data mining are as important in practice as the data mining process itself.
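The two basic quantities of association rule discovery, support and confidence, can be sketched as follows. The encoding is an assumption made for illustration: each "transaction" is taken to be the set of marker intervals in which one meiosis shows a recombination, and the interval labels are hypothetical. The abstract does not specify the actual encoding.

```python
def support(transactions, items):
    """Fraction of transactions (a non-empty list of sets) that contain every item."""
    wanted = set(items)
    return sum(wanted <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent; 0.0 if the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical recombination records, one set of marker intervals per meiosis.
meioses = [
    {"m1-m2", "m2-m3"},
    {"m1-m2", "m2-m3"},
    {"m1-m2"},
    {"m3-m4"},
]
print(support(meioses, ["m1-m2", "m2-m3"]))
print(confidence(meioses, ["m1-m2"], ["m2-m3"]))
```

Rules with high support and confidence would then be the candidates passed on to interpretation, the step the abstract singles out as important in practice.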
-
Pedro Romero, Zoran Obradovic, A.Keith Dunker
1997Volume 8 Pages
110-124
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Our recently reported results [14, 29, 30] provide strong support for a hypothesis that some amino acid sequences code for disordered regions rather than structured ones and that such disordered regions are commonly involved in function. General and family-specific neural network predictors developed in those previous studies suggest that different classes of disordered regions exist. Here, family-specific data preprocessing for disorder prediction in the calcineurin (CaN) family is explored. The results show that prediction of order and disorder on CaN sequence data benefits significantly from the use of family-specific preprocessing, with feature extraction through principal components analysis (PCA) outperforming feature selection techniques, although all methods do a good job of discriminating CaN-specific disordered regions from CaN-specific ordered regions. On the other hand, for the discrimination of CaN-specific disordered regions from general (unrelated to CaN) ordered regions, feature selection approaches proved to be more appropriate than PCA. The results further support a hypothesis that different kinds of disordered regions exist, as all family-specific disorder predictors developed in this study significantly outperformed a previously reported general multi-family disorder predictor.
-
Sivasundaram Suharnan, Takeshi Itoh, Hideo Matsuda, Hirotada Mori
1997Volume 8 Pages
125-134
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Motivation: Several methods in genetic information have recently been developed to classify protein sequences by their sequence similarity. These methods are essential for understanding the function of predicted open reading frames (ORFs) and their molecular evolutionary processes. However, since many protein sequences consist of a number of independently evolved structural units (we refer to these units as components), the combinatorial nature of the components makes it difficult to classify the sequences.
Results: This paper presents a new method for classifying uncharacterized protein sequences. As the measure of sequence similarity, we use a similarity score computed by a method based on the Smith-Waterman local alignment algorithm. We describe how this method copes with sequences that have a multi-component structure. The method was applied to predicted ORFs on the Escherichia coli genome, and we discuss the algorithm and experimental results.
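For reference, the Smith-Waterman local alignment score mentioned above can be computed with the short sketch below. It uses a flat match/mismatch score and a linear gap penalty as simplifying assumptions; a protein comparison would normally use a substitution matrix, and the paper's own scoring scheme is not given in the abstract.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b, keeping one DP row in memory."""
    previous = [0] * (len(b) + 1)
    best = 0
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if char_a == char_b else mismatch)
            # A local alignment may restart anywhere, so no cell drops below zero.
            cell = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

Because the score is local, two multi-component sequences that share a single component can still score highly, which is the situation the classification method has to cope with.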
-
Katsutoshi Takahashi, Masayuki Nakazawa, Yasuo Watanabe
1997Volume 8 Pages
135-146
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have developed a powerful image processing system DNAinsight, which performs automated detection of several thousands of spots found on autoradiogram images obtained with 2-D gel electrophoresis of genomic DNA. Algorithms and parameters for detecting spot locations and intensities are carefully chosen so as to enable reliable and rapid processing of 2-D gel electrophoretograms based on the RLGS (restriction landmark genomic scanning) method. In DNAinsight, matching of several related spot patterns, such as those from tumor-cell and normal-cell, can be accomplished rapidly with easy operations, being solved by comparing the Delaunay net and relative neighborhood graph. The automated and accurate image processing system strongly supports the rapid identification and analysis of genetic variation in the DNA of humans and other animals.
-
Masaru Tomita, Tom Shimizu, Kanako Saito, J. Craig Venter, Kenta Hashi ...
1997Volume 8 Pages
147-155
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We present E-CELL, a generic computer software environment for modeling a cell and conducting experiments in silico. The E-CELL system allows a user to define functions of proteins, protein-protein interactions, protein-DNA interactions, regulation of gene expression and other features of cellular metabolism, in terms of a set of reaction rules. The system then executes those reactions iteratively, and the user can observe, through a computer display, dynamic changes in concentrations of proteins, protein complexes and other chemical compounds in the cell.
Using this software, we constructed a model of a hypothetical cell with only 127 genes sufficient for transcription, translation, energy production and phospholipid synthesis. Most of the genes are taken from Mycoplasma genitalium, the organism having the smallest known chromosome, whose complete 580 kb genome sequence was determined at TIGR in 1995.
We discuss future applications of the E-CELL system with special respect to genome engineering.
-
Laurence Vignal, Frédérique Lisacek
1997Volume 8 Pages
156-165
Published: 1997
Released on J-STAGE: November 16, 2011
JOURNAL
FREE ACCESS
Given the problem of identifying exons in new genomic DNA, the sketch of a resolution process was drawn using sequence data and models of site/signal recognition. A multi-agent architecture is used to validate these models and test hypotheses on the chronology of events involved in gene splicing. Information is channelled through a hierarchy of agents. Each type of agent is the result of a successful step in the resolution process. The system does not rely on the compositional bias of coding sequences which is a key feature of current computer methods.
-
Frédéric Achard, Emmanuel Barillot, Gis Infobiogen
1997Volume 8 Pages
166-172
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper focuses on a specific type of information frequently used by researchers in Genetics: links between genome objects. It emphasizes the fact that, at present, links are not sufficiently characterized and describes our work to address this problem: the design of a prototype databank to store links between genome databases. Because this global repository is of concern for many people, we welcome and encourage feedback from the community.
-
Tatsuya Akutsu
1997Volume 8 Pages
173-179
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper describes simple DP (dynamic programming) algorithms for RNA secondary structure prediction with pseudoknots, for which no explicit DP algorithm had been known. Results of preliminary computational experiments are also described.
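As background for the DP formulation, the sketch below shows the classical pseudoknot-free base-pair maximization recurrence (Nussinov style). It is not the algorithm of this paper, which extends DP to pseudoknots; the allowed pairs and the minimum loop length are assumptions chosen for the example.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in an RNA sequence."""
    allowed = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    # case: k pairs with j, splitting the interval in two
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))
```

The recurrence only ever combines disjoint or nested intervals, which is why crossing pairs, i.e. pseudoknots, fall outside it and need the extended formulations the paper describes.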
-
Tatsuya Akutsu
1997Volume 8 Pages
180-186
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper shows that the problem of finding a protein side-chain packing is computationally hard (NP-hard), where the problem is defined here as a combinatorial search problem using a rotamer library. Although this result does not suggest a new method, it gives a justification for previous methods using such heuristics as simulated annealing, neural networks, genetic algorithms, and Gibbs sampling.
-
Winston Hide, John Burke, Alan Christoffels, Robert Miller
1997Volume 8 Pages
187-196
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
In order to provide a novel maximised approach to the generation of accurate, comprehensive, consensus sequences of the expressed human genome, we have developed and produced a system for a novel-representation, broad gene coverage, consensus database of expressed human gene fragments (ESTs). To perform clustering of ESTs, we have developed and employed D2-cluster, an algorithm based on the d2-search algorithm (Hide et al. 1994) specifically for EST clustering. D2-cluster does not require alignment in order to perform clustering (Burke, Davison and Hide, in prep). We have incorporated d2-cluster into a portable and novel system to perform clustering, alignment and automated error analysis of publicly available expressed sequence tags (STACKIPACK). The system includes a statistically robust algorithm that can detect and compensate for error within an aligned cluster of ESTs. We have manufactured a database of partial human consensus sequences from 552 013 ESTs from dbEST 040896 and TIGR. The database is termed Sequence Tag Alignment and Consensus Knowledgebase (STACK). STACK 1.0 contains 18 divisions based on tissue annotation identifying 204 431 unique sequences and generating 76 131 consensi which represent 321 134 ESTs. The consensus sequences have an average length of 497 bases, a 39% increase over the 357 base average length of the input data set. Clone Ids are used to join 92 759 unique sequences and 48 858 consensi into 61 632 linked sequences, averaging 900 bases each. The distribution of clusters compares favourably with UniGene, reflecting the difference in methodology of clustering and the higher input number of sequences into STACK. A high-accuracy database, SANIGENE, is also generated, consisting of sequences which agree in at least two ESTs. STACK is a distributable, core information resource upon which a comprehensive knowledgebase can be built.
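The idea of comparing sequences without alignment can be illustrated with a word-count distance. The sketch below assumes the common definition of a d2-style measure as the sum of squared differences in word counts, computed over whole sequences with a fixed word length; the windowing, word lengths and thresholds used by d2-search and D2-cluster are not given in the abstract and are not reproduced here.

```python
from collections import Counter


def word_counts(seq, word_length):
    """Count every overlapping word of the given length in seq."""
    return Counter(seq[i:i + word_length] for i in range(len(seq) - word_length + 1))


def d2_distance(a, b, word_length=3):
    """Sum of squared differences between the word counts of a and b (assumed definition)."""
    counts_a = word_counts(a, word_length)
    counts_b = word_counts(b, word_length)
    return sum((counts_a[w] - counts_b[w]) ** 2 for w in set(counts_a) | set(counts_b))


print(d2_distance("ACGTACGTAC", "ACGTTCGTAC"))
```

Two ESTs would be placed in the same cluster when some pair of regions scores below a chosen threshold; that clustering step is left out of this sketch.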
-
Makoto Hirosawa, Katsumi Isono
1997Volume 8 Pages
197-206
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Previously, we developed a GeneMark-based procedure, termed GeneMark-RC, and applied it for the identification and classification of ORFs in genomic sequence data, and identified and characterized ORFs in the 1.0 Mb data of the cyanobacterium Synechocystis sp. strain PCC 6803. In the present study, we have improved the procedure and performed analysis of the whole genomic data of Synechocystis. Consequently, we noticed the presence of three distinct classes of ORFs in this organism. The prediction of ORFs by the class-specific GeneMark-RC analysis agreed with 97.9% of those described for this bacterium. Moreover, 124 additional ORFs were identified. The procedure was similarly applied to the genomic analysis of five other prokaryotes, and 2 to 3 classes of ORFs were recognized in each case. Common features were found among the ORFs identified in the six organisms including Synechocystis. Class 1 is composed of most typical ORFs whose GC content is slightly higher than the average, while Class 2 is composed of ORFs with GC contents lower than the average. It was found that ORFs of one species can be detected with the GeneMark-RC parameters obtained from other organisms, and the prediction rate is high when the difference in their GC contents is small. It was also found that ORFs of three species with relatively low GC contents can be nicely detected with the Synechocystis matrices of Class 2 ORFs whose GC content is similar to that of the three species. Therefore, although there are two to three classes of ORFs in each species, their di-codon statistics must be rather similar to each other if their GC contents are similar. A notable exception was the case of Methanococcus jannaschii, which might reflect the fact that it is an archaebacterium.
-
Andreas M. Kogelnik, Shamkant B. Navathe, Douglas C. Wallace
1997Volume 8 Pages
207-214
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have developed the Georgia Tech Emory Networked Object Management Environment (GENOME). GENOME is a prototype database management system (DBMS)/user interface system designed to manage complex biological data, allowing users to more fully analyze and understand relationships in human genome data. The system is designed to allow the establishment of a network of searchable data sources. The DBMS portion of the environment is a hybrid object-relational system which interprets its data structures on-the-fly, resulting in an extremely flexible DBMS. Such a DBMS provides an environment for interrelating distributed data items, allowing users to further explore computational questions in biomedical science in addition to other fields by maximizing access to data. In developing GENOME, we used MITOMAP, a human mitochondrial genome database, as a model genomic database. MITOMAP encompasses one of the most complete collections of genomic data available for a specific locus or chromosome, including functional, population variation, disease mutation, and gene-gene interaction data, as well as complete sequence data for the human mitochondrial chromosome, and thus serves as an excellent model system. An effective DBMS is required for handling the plethora of Human Genome Project data to handle the various locus-specific databases and ultimately to unify all human genetic and biomedical information through the complete human genome sequence. Developing such a DBMS is our goal. We expect that GENOME will be generally applicable to other biological and non biological paradigms as well.
-
Éric Rivals, Jean-Paul Delahaye, Max Dauchet, Olivier Delgrange
1997Volume 8 Pages
215-226
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Long direct repeats in genomes arise from molecular duplication mechanisms like retrotransposition, copy of genes, exon shuffling, ... Their study in a given sequence reveals its internal repeat structure as well as part of its evolutionary history. Moreover, detailed knowledge about the mechanisms can be gained from a systematic investigation of repeats. The problem of finding such repeats is viewed as an NP-complete problem of the optimal compression of a sequence thanks to the encoding of its exact repeats. The repeats chosen for compression must not overlap each other as do the repeats which result from molecular duplications. We present a new heuristic algorithm, Search_Repeats, where the selection of exact repeats is guided by two biologically sound criteria: their length and the absence of overlap between those repeats. Search_Repeats detects approximate repeats, as clusters of exact sub-repeats, and points out large insertions/deletions in them. Search_Repeats takes only 3 seconds of CPU time for the genome of Haemophilus influenzae on a Sun Ultrasparc workstation.
-
Kiyotaka Shiba, Hiromi Motegi, Tetsuo Noda
1997Volume 8 Pages
227-233
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Aminoacyl-tRNA synthetases (ARSs) are believed to have arisen early in the evolution of life as the essential components that establish the link between triplet codons and amino acids. We have cloned and sequenced eight cDNAs for human cytoplasmic ARSs. Along with twelve sequences that have been reported from other laboratories, a set of 20 human cytoplasmic ARS genes is now available. We compared these human ARSs with ~400 ARS sequences currently available from various organisms and deduced the possible evolutionary history of these enzymes. The availability of complete sets of ARSs from thirteen organisms (H. sapiens, S. cerevisiae, E. coli, H. influenzae, H. pylori, N. gonorrhoeae, S. pyogenes, M. genitalium, M. pneumoniae, Synechocystis sp., M. jannaschii, M. thermoautotrophicum, and A. fulgidus) made systematic analyses of the evolution of this gene family possible. In this paper, we focus on two topics: (1) the acquisition of new structural domains by the core enzyme domains in higher eukaryotes and their possible role in the formation of multi-synthetase supra-molecular complexes, and (2) the existence of eukaryotic-like ARSs in some bacterial genomes, and the relationship of this occurrence to tRNA recognition.
-
Jianghong An, Yasushi Kubota, Takao Nakama, Akinori Sarai
1997Volume 8 Pages
234-235
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have created an integrated database, search and visualization tool, named 3DinSight, to help researchers gain insight into the relationship between the structure, function and properties of biomolecules. Various kinds of searches can be carried out through WWW interfaces. The locations of motif sequences and mutations are automatically mapped on the structure, and visualized in 3D space by interactive viewers, VRML (Virtual Reality Modeling Language) and RasMol. In the case of VRML, the mapped 3D objects are hyper-linked to the corresponding document data. The amino-acid properties of structural, functional and mutation sites can be displayed as graph plots. 3DinSight is freely accessible through the URL http://www.rtc.riken.go.jp/3DinSight.html.
-
Genome Information Broker and the Enhancement of SAKURA
Kousuke Goto, Toshitsugu Okayama, Hirotada Mori, Hikaru Yamamoto, Tomo ...
1997Volume 8 Pages
236-237
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Thanks to genome projects, data submitters, reviewers and users of the DNA Data Bank of Japan (DDBJ) now process sequence data longer than 1M base pairs. In order to realize smooth and reliable submission, annotation and dissemination of large-scale genetic information, DDBJ developed systems which visualize sequences and relevant biological information. A newly developed data dissemination system named Genome Information Broker and the enhancement of the Web data submission system SAKURA are introduced here from the viewpoint of visualization.
-
Yasuhiko Kitamura, Tetsuya Nozaki, Shoji Tatsumi, Akira Tanigami
1997Volume 8 Pages
238-239
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
MetaCommander is developed as a generic tool to retrieve and integrate information from WWW servers by interpreting a script. By using MetaCommander, we can support genome information processing on WWW browsers in various ways.
-
Toward Software Agent for Genome Information Analysis
Hiroshi Matsuno, Manabu Hori, Nobuaki Wada, Miyako Tanaka
1997Volume 8 Pages
240-241
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Hiroshi Matsuno, Misako Ichimura, Tatsumi Fukuyama, Miyako Tanaka
1997Volume 8 Pages
242-243
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
S. Minoshima, S. Mitsuyama, S. Ohno, T. Kawamura, N. Shimizu
1997Volume 8 Pages
244-245
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Tadashi Mizunuma, Sadahiko Misu, Motonori Ota, Ken Nishikawa
1997Volume 8 Pages
246-247
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Kotoko Nakata, Takako Igarashi, Tsuguchika Kaminuma
1997Volume 8 Pages
248-249
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
A database for receptors on the cell membrane has been developed. The system can collect data items such as attributes of proteins from distributed data sources on the Internet. Such sources include internationally standard biological databases such as the updated genetic database of PIR, Swiss Prot, PDB, GenBank, EMBL and GDB. The system provides various viewing tools that effectively display different types of receptor data: DNA sequences, amino acid sequences, DNA binding sites, ligand binding sites, gene and disease information, and protein structural information. It can also display three-dimensional images using the freeware program RASMOL. DNA binding sites, ligand binding sites and active sites are distinguished by coloring the sequences. PDB matching sites are distinguished by italicization. CSNDB (Cell Signaling Networks Database), a database for human cellular signal transduction, is also linked to the system. The database may be useful as a quick reference for ligand-membrane receptors and signal transduction in drug design.
View full abstract
-
Tadasu Shin-i, Yuji Kohara
1997 Volume 8 Pages 250-251
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have developed a WWW-based database, named NEXTDB, to integrate all the information on ESTs (tag sequences of cDNA clones) and gene expression patterns of C. elegans that are being produced and analyzed in this laboratory. NEXTDB incorporates and processes raw tag-sequencing data and classifies the tags into unique cDNA groups by comparing the 3'-tags. The database contains information on the map positions of the cDNA groups, their correspondence to predicted CDSs, and homologies to genes of other organisms. NEXTDB also incorporates in situ hybridization image data showing the expression patterns of individual cDNA groups, and provides a platform for annotating the images. The database also contains the cosmid contig maps obtained from AceDB. All of this information is interlinked in NEXTDB, which can be accessed through the Internet.
-
DNAinsight
Takayuki Toda, Katsutoshi Takahashi, Masayuki Nakazawa, Yasuo Watanabe
1997 Volume 8 Pages 252-253
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Tatsuya Akutsu, Satoru Kuhara, Osamu Maruyama, Satoru Miyano
1997 Volume 8 Pages 254-255
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Tatsuya Akutsu, Akira Ohyama, Kyotetsu Kanaya, Asao Fujiyama
1997 Volume 8 Pages 256-257
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have been developing an image analysis system named DDGEL for 2D gel electrophoresis of genomic DNA. Recently, we have developed a program module for finding a correspondence of spots between two gel electrophoresis images.
-
Tatsuya Akutsu, Hiroshi Tashimo
1997 Volume 8 Pages 258-259
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have been developing a novel method of deriving a score function for protein threading. In this method, the constraint that the score of the native threading is the minimum over all possible threadings is expressed in the form of linear inequalities, and the parameters defining the score function are then determined by solving these inequalities. The proposed method was evaluated using Lathrop and Smith's algorithm for finding optimal threadings and was shown to be effective for computing nearly correct threadings.
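The idea of turning "the native threading scores lowest" into linear inequalities can be illustrated with a minimal sketch. It assumes, purely for illustration, that the score is a weighted sum of features, so each alternative threading d gives one inequality w · (d − native) > 0. The abstract does not say how the inequalities were solved; the perceptron-style update below is a simple stand-in, and the names `fit_score_weights` and `score` and the feature vectors are hypothetical.

```python
def score(w, features):
    """Linear score: weighted sum of the feature values of one threading."""
    return sum(wk * fk for wk, fk in zip(w, features))


def fit_score_weights(native, decoys, max_epochs=1000, margin=1.0):
    """Search for weights w with w.(d - native) >= margin for every decoy d.

    Perceptron-style updates are used here as an illustrative solver
    (an assumption, not the method of the abstract).
    Returns the weight list, or None if no solution is found in max_epochs.
    """
    # One difference vector per inequality: decoy features minus native features.
    diffs = [[dk - nk for dk, nk in zip(d, native)] for d in decoys]
    w = [0.0] * len(native)
    for _ in range(max_epochs):
        violated = False
        for diff in diffs:
            if score(w, diff) < margin:
                # Move w toward satisfying the violated inequality.
                w = [wk + xk for wk, xk in zip(w, diff)]
                violated = True
        if not violated:
            return w
    return None


# Hypothetical feature vectors for one native threading and three alternatives.
native = [1.0, 0.0]
decoys = [[0.0, 1.0], [2.0, 2.0], [1.0, 3.0]]
w = fit_score_weights(native, decoys)
```

With the weights found, `score(w, native)` is lower than `score(w, d)` for every alternative `d`, which is the constraint described above. When the inequalities are contradictory, the function gives up and returns `None`.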
-
Hidemasa Bono, Susumu Goto, Hiroyuki Ogata, Minoru Kanehisa
1997 Volume 8 Pages 260-261
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Predicting gene functions from the whole genome sequence is an important problem in the post-genome era. We are developing a system that predicts functions from a whole genome sequence, using a functionally well-annotated genome as a reference organism for knowledge of biologically well-known pathways. The databases of gene catalogs and pathways are compiled under the KEGG project. In this paper we show an example of identifying the functions of genes involved in the two-component signal transduction system.
-
Hiroki Fukasawa, Shigehiko Kanaya, Yoshihiro Kudo
1997 Volume 8 Pages 262-263
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Takamasa Futatsuki, Yuichi Kawanishi, Kimitoshi Naito, Satoru Miyazaki ...
1997 Volume 8 Pages 264-265
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We analyzed the Smith-Waterman algorithm and the maximum likelihood method and implemented them on the Fujitsu VPP500 vector-parallel computer. The programs optimized for this computer are ssearch, clustalw and fastDNAml. Our goal is to develop a complete system that covers all processes, from database search to the construction of large-scale phylogenetic trees, on a supercomputer.
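For readers unfamiliar with the Smith-Waterman algorithm mentioned above, the following is a minimal serial sketch of its scoring recurrence. It is not the vectorized VPP500 implementation described in the abstract. The scoring values (match 2, mismatch -1, linear gap -2) and the function name `smith_waterman_score` are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, since only the score is needed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its upper, left and upper-left neighbours, so the cells on one anti-diagonal are independent of each other. That independence is the property that parallel and vector implementations of this recurrence typically exploit.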
-
Osamu Gotoh
1997 Volume 8 Pages 266-267
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Qian-Ping Gu, Kazuyuki Iwata, Shietung Peng, Qi-Ming Chen
1997 Volume 8 Pages 268-269
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Shugo Hamahashi, Hiroaki Kitano
1997 Volume 8 Pages 270-271
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Embryogenesis is one of the most important and mysterious processes of animal development. It is complex and hard to understand because it involves many elements, such as cells and nuclei, that interact with each other. We replicated the system of early segmentation in Drosophila on a computer. Computer simulation enables us to understand a whole system of animal development. The work reported here, which is part of the Virtual Drosophila project, is an attempt to observe in detail the mechanism of segmentation during the early development of Drosophila by computer simulation.
-
Yoshitomo Harada, Masato Wayama, Toshio Shimizu
1997 Volume 8 Pages 272-273
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Junichi Isamikawa, Katsutoshi Takahashi, Masayuki Nakazawa, Yasuo Wata ...
1997 Volume 8 Pages 274-275
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Masahiro Hattori, Minoru Kanehisa
1997 Volume 8 Pages 276-277
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
To analyze apoptotic molecular interactions, we have developed a knowledge base on apoptosis, which consists of molecular interactions concerning apoptosis reported in experimental papers. We have collected about 80 entries; each entry corresponds to one molecule and contains its interaction information.
-
Mika Hirakawa, Kensaku Imai, Hiroko Yamaguchi, Junko Shimada, Kazuo Ta ...
1997 Volume 8 Pages 278-279
Published: 1997
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
The goal of the Advanced Life Science Information Systems (ALIS) project is the construction of an entire human genome database that will provide an efficient source of information for researchers after the human genome has been sequenced. We initiated this project to encourage large-scale human genome sequencing and to develop systems for genome data management and data publishing on the World Wide Web. It has been 2 years since the project began; our first attempt at human genome sequencing is going well, and more than 4M bases of well-edited human genome sequence have been acquired. The human genome project is progressing, and an international consensus on releasing data generated from the project has been defined. We have improved our sequencing database to adapt to this situation. Recently we organized a collection and publication system for the genome sequencing data.