Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
Volume 7
Displaying 1-50 of 78 articles from this issue
  • Louxin Zhang
    1996 Volume 7 Pages 1-12
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    A conjecture of Mirkin, Muchnik and Smith is answered affirmatively. The conjecture connects the inconsistency function, a biologically meaningful dissimilarity measure for a gene tree and a species tree, to the mutation cost function, a combinatorial measure based on mappings between trees. A linear-time algorithm for computing the inconsistency function is also derived from the conjecture.
  • Sorting Signed Permutations by Reversals and Transpositions
    Qian-Ping Gu, Shietung Peng, Hal Sudborough
    1996 Volume 7 Pages 13-22
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Recently, a new approach to analyzing genome evolution was proposed that compares gene orders rather than DNA sequences (Sankoff et al., 1992). The approach is based on global rearrangements (e.g., inversions and transpositions of fragments). Analysis of genomes evolving by inversions and transpositions leads to the combinatorial problem of sorting by reversals and transpositions, i.e., sorting a permutation using reversals and transpositions of arbitrary fragments. The problem is conjectured to be NP-hard. We study sorting of signed permutations by reversals and transpositions, a problem that adequately models genome rearrangements because the genes in DNA are oriented. We establish a lower bound and give two algorithms for the problem. Based on the lower bound, we show that the first algorithm is a 2-approximation algorithm. The time complexity of this algorithm may not be bounded by Poly(n), where n is the length of the permutation to be sorted. By setting a time limit on the first algorithm, we obtain the second algorithm, which is a 2(1+1/k)-approximation algorithm, where k ≥ 3 is any fixed integer, and runs in Poly(n) time.
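    The two rearrangement operations named in this abstract can be illustrated with a minimal sketch. The function names and the index conventions (Python slices) are assumptions made for illustration, and the breakpoint count shown is a quantity commonly used in this literature, not necessarily the lower bound established in the paper.

```python
def reversal(perm, i, j):
    """Reverse the block perm[i:j] and flip the sign (orientation) of each gene."""
    return perm[:i] + [-x for x in reversed(perm[i:j])] + perm[j:]


def transposition(perm, i, j, k):
    """Exchange the two adjacent blocks perm[i:j] and perm[j:k]."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def breakpoints(perm):
    """Count adjacent pairs (a, b) in 0, perm, n+1 that do not satisfy b - a == 1."""
    ext = [0] + perm + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if b - a != 1)


# Sorting the signed permutation [3, -2, -1, 4] with one reversal and one transposition.
p = [3, -2, -1, 4]
p = reversal(p, 1, 3)          # [3, 1, 2, 4]
p = transposition(p, 0, 1, 3)  # [1, 2, 3, 4]
```

    A signed permutation is sorted when it equals the identity with all signs positive, which is exactly when the breakpoint count reaches zero.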
  • H. Matsuda, T. Ishihara, A. Hashimoto
    1996 Volume 7 Pages 23-32
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper presents a method for clustering a large, mixed set of uncharacterized sequences provided by genome projects. As the clustering measure, we use a fast approximation of sequence similarity (the FASTA score). However, when two sequences have diverged considerably during evolution, FASTA sometimes underestimates their similarity compared with the rigorous Smith-Waterman algorithm. In addition, the distance derived from the similarity score may not be a metric, since the triangle inequality may not hold when the sequences have a multi-domain structure. To cope with these problems, we introduce a new graph structure, called a p-quasi complete graph, for describing a cluster of sequences with a confidence measure. We prove that a restricted version of the p-quasi complete graph problem (given a positive integer k, does a graph contain a 0.5-quasi complete subgraph of size at least k?) is NP-complete. We therefore present the outline of an approximation algorithm for clustering a set of sequences into subsets corresponding to p-quasi complete graphs. The effectiveness of the method is demonstrated by clustering Escherichia coli protein sequences.
  • Fei Shi, Peter Widmayer
    1996 Volume 7 Pages 33-40
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We are given a finite set S of text strings and a pattern P over some fixed alphabet Σ. The topic of this paper is the design of a data structure D(S) that supports approximate multiple string searching queries efficiently. For a given upper bound k ∈ Z+ on the allowable distance, P = p1...pm is said to appear approximately in a text T = t1...tn, m, n ∈ Z+, if there exist positions u, v in T such that the edit distance between P and tu...tv is at most k. Let N denote the sum of the lengths of all strings in S. We present an algorithm that constructs the data structure D(S) in O(N) time and space. Afterwards, an approximate multiple string search query can be answered in O(N) expected time if the allowable distance k is bounded above by O(m/log m). The method can be used to search large nucleotide and amino acid sequence databases for similar sequences.
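    The definition of an approximate occurrence given above can be made concrete with a short sketch. It uses the classical dynamic-programming scan over a single text (often attributed to Sellers), which takes O(mn) time; it is not the data structure D(S) of the paper, and the function name is hypothetical.

```python
def approximate_occurrences(pattern, text, k):
    """Return the 1-based end positions v such that some substring
    text[u..v] has edit distance at most k to the pattern."""
    m = len(pattern)
    # prev[i]: minimum edit distance between pattern[:i] and any
    # substring of the text ending at the previous position.
    prev = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0 because a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra character in the text
                         cur[i - 1] + 1)      # pattern character left unmatched
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends
```

    For example, `approximate_occurrences("ACGT", "TTACCTTT", 1)` returns `[6]`, because the substring ACCT ending at position 6 differs from the pattern by one substitution.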
  • Tetsuo Shibuya, Hiroshi Imai
    1996 Volume 7 Pages 41-50
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    The alignment of DNA or protein sequences is widely applicable and important in various fields of molecular biology. The optimal solution obtained with fixed parameters (gap penalties, weights for weighted multiple alignment problems, and so on) is not always the biologically best alignment. Thus, it is necessary to vary the parameters and examine how the optimal alignment changes. Parameter variation has been well studied for the alignment of two sequences [6, 7, 12, 13, 14, 15], but not for multiple alignment, because of the difficulty of computing the optimal solution. This paper presents techniques for the parametric multiple alignment problem and examines, through experiments, the features of alignments obtained by parametric analysis of the gap penalty and the weight matrix. These experiments reveal the importance of adopting appropriate parameter values to obtain meaningful multiple alignments.
  • Michiyo Yamaguchi, Shinichi Shimozono, Takeshi Shinohara
    1996 Volume 7 Pages 51-60
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We propose a learning algorithm that discovers, from biosequences, a motif represented by patterns and an alphabet indexing. Using only positive examples and with the help of an alphabet indexing, the algorithm finds k regular patterns as a k-minimal multiple generalization (k-mmg for short). Computational results for transmembrane domains indicate that the combination of k-mmg and alphabet indexing works quite successfully. We also introduce a partial alphabet indexing that transforms symbols depending on their position in the sequence.
  • Comparison of the Diversity Among Bacteria and Prediction of the Protein Production Levels in Cells
    Shigehiko Kanaya, Yoshihiro Kudo, Shinya Suzuki, Toshimichi Ikemura
    1996 Volume 7 Pages 61-71
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    In the present study, we developed a procedure for estimating the species-specific heterogeneity of codon usage among intraspecific genes, called diversity in codon usage, and for systematizing species by this species-specific diversity on the basis of principal component analysis. We attempted to quantify differences in diversity among five species: Escherichia coli (Ec), Salmonella typhimurium (St), Haemophilus influenzae (Hi), Bacillus subtilis (Bs), and Synechocystis sp. (Ss). In all five species, many of the genes involved in the translation process and energy metabolism had positive values (Z1>0) on the first principal component (PC1). In Ss, many of the genes involved in the photosynthetic system also had positive Z1-values. These genes are thought to be highly expressed. By the direction of PC1, the five species were roughly classified into three categories: [Ec, St, Hi], [Ss], and [Bs]. The dendrogram constructed was roughly consistent with the rRNA-based phylogeny, but interesting differences were also observed between the two phylogenetic trees.
  • Kenta Nakai
    1996 Volume 7 Pages 72-81
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Since signal peptides play a crucial role in specifying the in-vivo fate of proteins, predicting their existence is important for characterizing ORFs of unknown function. To make such predictions as reliable as possible, the features of signal peptides of two important model organisms, Saccharomyces cerevisiae and Bacillus subtilis, were examined and the accuracy of current prediction methods was refined using these data. Direct optimization of the threshold values of existing methods significantly raised the predictability, but the variables that were most effective for improvement differed between the two organisms. In yeast, the maximum hydrophobicity value of an 8-residue segment mainly contributed to raising the predictability to 98.5% when estimated by the cross validation procedure. In Bacillus species, the length of the uncharged segment and the charges in the N-terminal region (net charge and negative charge) were combined to give a prediction accuracy of 98.2%, although the data size was relatively small in this case.
  • M. S. Gelfand, T. V. Astakhova, M. A. Roytberg
    1996 Volume 7 Pages 82-87
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Since absolutely reliable recognition of protein-coding regions in eukaryote genomic DNA sequences by computational methods is unattainable, most existing algorithms try to keep some balance between underprediction and overprediction. However, in experimental practice it is often sufficient to have just a few protein-coding segments, but predicted with high specificity, that is, with (almost) no overprediction. Such predictions are then used for construction of oligonucleotide probes and PCR primers for analysis of cDNA libraries or total cellular RNA.
    Here we present a combinatorial algorithm solving this problem. Unlike other prediction schemes, the algorithm uses only the simplest statistical parameters (codon usage and positional nucleotide sequences in splicing sites) and thus can be used for analysis of obscure genomes, when large learning sets are unavailable. The algorithm's structure allows one to simply tune it for various experimental settings.
  • Kiyoshi Asai, Tetsushi Yada, Katunobu Itou
    1996 Volume 7 Pages 88-97
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    A new method for incorporating a protein motif dictionary into a gene-finding system is proposed. The system consists of Hidden Markov Models (HMMs) and a dictionary. The HMMs represent the nucleic acid bases, the codons, and the amino acids. The ‘words’ in the dictionary are described by sequences of these HMMs and represent the noncoding regions, the codons, protein motifs, tRNA regions and signals in DNA sequences. The statistics between these regions are expressed by the “grammar”, which is a stochastic network of the ‘words’.
    Using the same kind of technique as speech recognition by HMMs with a word dictionary and a grammar, the stochastic network of ‘words’ enables the motif dictionary to be used during the parsing of DNA sequences. At the same time, information on di-codon statistics, which are known to be important parameters, is included in the stochastic network. As a result, while the system parses DNA sequences and finds the coding regions, the protein motifs in those regions are annotated automatically. This helps to identify the functions of the genes and reduces the cost of a homology search for each hypothetical coding region. The method differs from simply using the results of a homology search: it uses the motif patterns during the parsing process itself, whereas searching for motif patterns before or after finding the coding regions cannot directly affect the parsing. Experimental results show that the method correctly finds and annotates the motifs in the coding regions of a cyanobacterium DNA sequence.
  • Naohiro Furukawa, Satoshi Matsumoto, Ayumi Shinohara, Takayoshi Shouda ...
    1996 Volume 7 Pages 98-107
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We developed a machine learning system, HAKKE, that is suitable for predicting functional regions from sequences, for example protein-coding region prediction and transmembrane domain prediction. HAKKE is a hybrid system in which a pool of algorithms cooperates to make an accurate prediction. The system uses an extension of the weighted majority algorithm to fit the strength of each algorithm to the given training examples. In this paper, we describe the core of the system and show some experimental results on transmembrane domain and α-helix predictions.
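    For orientation, the basic weighted majority algorithm that the abstract refers to can be sketched as follows. This is the textbook version for binary predictions, not the extension used in HAKKE; the function name, the penalty factor `beta` and the data layout are assumptions.

```python
def weighted_majority(expert_predictions, labels, beta=0.5):
    """Basic weighted majority voting over a pool of predictors.

    expert_predictions[t][i] is the 0/1 prediction of algorithm i on
    example t; labels[t] is the true 0/1 label. Returns the final
    weights and the number of mistakes made by the combined vote.
    """
    n = len(expert_predictions[0])
    weights = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_predictions, labels):
        vote1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        # Multiply the weight of every algorithm that erred by beta.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return weights, mistakes
```

    After training, the weights reflect how reliable each algorithm in the pool was on the examples seen: an algorithm that is always right keeps weight 1, while one that is always wrong decays geometrically.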
  • Carlos A. Del Carpio, Valentin Gogonea
    1996 Volume 7 Pages 108-118
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This article is the first in a series of papers describing the development of an automatic system for predicting the three-dimensional conformation of proteins in solution. In this first part we discuss the implementation of the protein conformational space mapping engine, a procedure based on a robust parallel genetic algorithm (GA) that runs on a network of transputers. We describe aspects of the algorithm related to the major factors that influence the protein folding process and their implementation within the scheme of the evolutionary algorithm. Among them, we thoroughly review the co-operativity of partial secondary structures that emerge as the evolutionary process proceeds, and its effects on the stability of newly generated conformers and on the performance of the GA. We then consider hydrogen bonding, synthesize the demographic trends in known proteins suggested by Stickle et al., and implement them as an index for assessing the goodness of each generation of protein conformers. Finally, we analyze in depth the packing of the amino acid side chains and show how a hybrid algorithm can relax the perturbations brought about by the GA operations and genuinely improve the overall process. In the second paper of this series we propose guidelines for implementing the solvent effect, which, together with the above-mentioned factors, results in a system for protein 3D structure prediction in solution.
  • Nickolai N. Alexandrov, Victor V. Solovyev
    1996 Volume 7 Pages 119-127
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Hydrophobic long-range interactions and local polypeptide chain propensities are the major factors directing protein folding. Incorporating both of these terms in addition to the Dayhoff matrix increases the quality of protein fold recognition via sequence-structure alignment. We have shown that the results of secondary structure prediction substantially increase the sensitivity of fold recognition. To measure the performance of protein fold recognition, we have developed a comprehensive test along with a set of quality control scores based on the most populated structural families. With this test we have demonstrated that considering the predicted secondary structure improves the sequence alignment, even without knowledge of the real three-dimensional structure.
  • Hiroyuki Ogata, Wataru Fujibuchi, Hidemasa Bono, Susumu Goto, Minoru K ...
    1996 Volume 7 Pages 128-136
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    In conjunction with a new database system that efficiently organizes metabolic pathway data from various organisms, we are developing computational methodologies using binary relations and hierarchies of enzymes. The biological knowledge integrated in the system includes genes, gene products, chemical compounds, enzyme reactions and metabolic pathway diagrams. By automatically mapping the enzymes of a specific organism onto the pathway diagrams, it becomes possible to visualize the characteristic features of organism-specific metabolic pathways. With the aid of the computational methodology implemented in the system, it also becomes possible to analyze and investigate the pathways in terms of their function and evolution. In this paper, we describe the outline of the system and present new biological features of metabolic pathways revealed by the system.
  • Yasuhiko Kitamura, Tetsuya Nozaki, Hideyuki Nakanishi, Teruhisa Miura, ...
    1996 Volume 7 Pages 137-146
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    With the advance of the Human Genome Project, a huge amount of diverse genome data has been stored in a number of databases, and the WWW system is widely used to access these databases. From the viewpoint of the information supplier, the WWW is a very useful tool for providing various types of data easily, but from the viewpoint of the information consumer, it is not good enough because it lacks a rigid data format and makes data access difficult. In this paper, by extending a current WWW browser, we propose two generic WWW tools, MetaViewer and MetaCommander. We apply them to genome informatics to support researchers who search, analyze, and dispatch genome data, and discuss their potential advantages from the viewpoint of the information consumer.
  • R. Gras, J. Nicolas
    1996 Volume 7 Pages 147-156
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We present a new tool, FOREST, which aims to represent the content of a large nucleic acid sequence (e.g., > 100 KB) in a form suitable for the biologist. More precisely, FOREST builds all subsequences repeated in a sequence or a set of sequences. It not only allows the user to look up the locations of the various occurrences of a given subsequence, but also points to subsequences that are interesting with respect to a given criterion. The tool is based on two key ideas. The first is to build a suffix-tree representation of a sequence and to associate with each node of this tree a set of synthesized attributes, computed on the set of subsequences under this node. This allows the biologist to “browse” the sequence with a constant abstract view of what he may expect to find in the section of the tree he is currently investigating. The second idea is to summarize the distribution of the information with boolean vectors associated with the sequence. These vectors can easily be displayed in the form of a linear map of events, as is done in genetic mapping. Both representations allow various efficient operations on the sequence. They provide a powerful capacity for filtering the data, while reducing the set of elementary filtering operations to a minimum of conceptual operations. This allows the biologist to easily investigate the most prominent features of the lexical structure of his sequences.
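    The two ideas in this abstract, locating repeated subsequences and summarizing them as a boolean vector along the sequence, can be illustrated with a deliberately simple sketch. It indexes fixed-length words in a dictionary instead of building a suffix tree, so it is not the FOREST data structure; the function names and the fixed word length `k` are assumptions.

```python
from collections import defaultdict


def repeated_kmers(sequence, k):
    """Map each length-k subsequence that occurs more than once to the
    0-based start positions of its occurrences."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


def repeat_mask(sequence, k):
    """Boolean vector marking every position covered by a repeated k-mer."""
    mask = [False] * len(sequence)
    for starts in repeated_kmers(sequence, k).values():
        for s in starts:
            for q in range(s, s + k):
                mask[q] = True
    return mask
```

    For example, `repeated_kmers("ACGTACGA", 3)` returns `{"ACG": [0, 4]}`, and the corresponding mask is true at positions 0 to 2 and 4 to 6. A suffix tree yields the same information for all lengths at once, which is why it suits large sequences better than this per-length index.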
  • T. Koike, T. Okayama, J. Ishii, T. Mizunuma, T. Tamura, Y. Tateno, H. ...
    1996 Volume 7 Pages 157-165
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    As molecular biology has made rapid progress in recent years, many changes have been required in the methodology for maintaining and utilizing DNA sequence data. For example, the annotation of sequences has become complex and extensive. DDBJ, recognizing these impending requirements, decided in 1995 to develop a new DNA sequence database system. To cope with frequent changes in the data structures and significant growth of the data in terms of quality and quantity, we designed a completely new database schema. In the new system, physical changes to the data structure do not affect applications such as annotation tools. We also designed a new annotation tool based on object-oriented concepts that allows us to handle DNA sequence data in computers as intuitively as in the real world. The annotation tool is named YAMATO II. The new system also addresses the needs of DDBJ itself. Data traffic and security in database access were analyzed in particular, and reviewers of DDBJ data who are located away from DDBJ can now process the data safely and comfortably. The new system also provides more robust and effective data exchange with the partners in the international nucleotide sequence banks, EMBL and GenBank.
  • Hajime Kitakami, Yasuma Mori, Masatoshi Arikawa
    1996 Volume 7 Pages 166-167
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We developed a taxonomy database system for managing multimedia contents. The system is accessible to remote users through the World-Wide Web and is implemented using SQL and CGI (Common Gateway Interface) scripts.
  • Hiroaki Kato, Yoshimasa Takahashi
    1996 Volume 7 Pages 168-169
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper describes an approach to the automated identification of three-dimensional (3-D) motifs in proteins. The structure of a protein is reduced to an abstract representation consisting of the α-helix and β-strand secondary structure elements, which are described by vectors in 3-D space rather than by the point-like atoms used in the simple Cα approximation. The algorithms and implementations are discussed, with a couple of examples in which 3-D motif candidates are identified using well-known motifs.
  • Takayuki Kamei, Yasuo Yonezawa
    1996 Volume 7 Pages 170-171
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    It is well known that enzymes are very important reaction factors in the activity of living systems. However, their properties have not yet been sufficiently studied from the standpoint of information theory. We therefore examined the correlation between the complexity of amino acid sequences and enzyme function using an informational measure, in order to elucidate the informational properties of sequence structure. In addition, the power spectrum of enzyme sequence complexity, obtained by the Fourier transform (FT) method, shows a specific profile. The results show that enzyme protein sequences have higher complexity than non-enzyme protein sequences. Moreover, the FT profile shows a typical pattern for the complexity of enzyme protein sequences. These results suggest a new viewpoint for protein analysis based on information science.
  • Y. Wada, H. Yasue
    1996 Volume 7 Pages 172-173
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have developed a new version of the Animal Genome Database for network users using Java applets. The new version includes a linkage homology map, a Java version of the clickable linkage map, and a Japanese tutorial with an audio clip. Furthermore, we have started a Mouse Genome Informatics mirror site in Japan.
  • A. Nakaya, A. Yonezawa, K. Yamamoto
    1996 Volume 7 Pages 174-175
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Masamichi Isokawa, Masato Wayama, Toshio Shimizu
    1996 Volume 7 Pages 176-177
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Takashi Ishikawa, Shigeki Mitaku, Takao Terano, Makiko Suwa, Takatsugu ...
    1996 Volume 7 Pages 178-179
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This research develops a method for discovering functional sites of amino acid sequences using an Inductive Logic Programming (ILP) method with sorted variable generalization. Functional sites provide clues to building a knowledge base for prediction of protein functions from amino acid sequences. The proposed method generates hypotheses of functional sites directly from aligned amino acid sequences using an ILP method extended with sorted variable generalization. The proposed method is shown to be useful for discovering functional sites by an example application to the case of bacteriorhodopsin-like proteins.
  • Takatsugu Hirokawa, Boon-Chieng Seah, Shigeki Mitaku
    1996 Volume 7 Pages 180-181
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    A new method for predicting transmembrane helices from amino acid sequences was developed, in which the stabilization of helices by interhelix binding is taken into account. It was assumed that transmembrane helix formation proceeds in three stages: binding to the membrane surface, formation of the transmembrane core region, and maturation of the helix through tertiary structure formation in the membrane. The method was applied to the amino acid sequences of membrane proteins for which the number of transmembrane helices is known, and most transmembrane helices were correctly predicted.
  • Takanori Washio, Masahiko Wada, Masaru Tomita
    1996 Volume 7 Pages 182-183
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Yasuhiro Asakawa, Masaru Tomita
    1996 Volume 7 Pages 184-185
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    An interesting correlation between G+C content and the length of primate introns has been found by our computer analysis.
    All sequences of primate introns were extracted from the GenBank database and classified into subgroups according to their lengths (the number of bases; increment of 100). G+C contents (%) were then calculated for each subgroup.
    The results indicate that shorter introns tend to contain more G and C nucleotides, and longer introns tend to contain more A and T nucleotides.
    Frequencies of each nucleotide for each subgroup are shown in figure 1.
    We also computed the G+C contents of the exons flanking those introns for each subgroup. As we can see in figure 2, similar but weaker tendencies are observed.
    The biological significance of these observations is currently under investigation. We also intend to extend our analysis to other eukaryotes.
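    The binning procedure described in this abstract can be sketched as follows. The abstract gives only the bin increment (100 bases); the half-open bin boundaries, the pooling of all bases within a bin before computing the percentage, and the function name are assumptions.

```python
from collections import defaultdict


def gc_by_length_bin(introns, bin_size=100):
    """Group sequences by length (bins of bin_size bases) and return the
    pooled G+C content in percent for each bin, keyed by bin start."""
    totals = defaultdict(lambda: [0, 0])  # bin start -> [G+C count, total bases]
    for seq in introns:
        s = seq.upper()
        start = (len(s) // bin_size) * bin_size
        totals[start][0] += s.count("G") + s.count("C")
        totals[start][1] += len(s)
    return {b: 100.0 * gc / n for b, (gc, n) in sorted(totals.items()) if n > 0}
```

    With the toy input `["GGCC", "ATAT"]`, both sequences fall into the 0 to 99 bin and the pooled G+C content is 50.0%. Averaging per-intron percentages instead of pooling bases would weight short and long introns within a bin equally; which of the two the authors used is not stated.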
  • Tom Shimizu, Kouichi Takahashi, Masaru Tomita
    1996 Volume 7 Pages 186-187
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Rintaro Saito, Hidekazu Sasaki, Yuko Osada, Masaru Tomita
    1996 Volume 7 Pages 188-189
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Yuko Osada, Ryo Matsushima, Masaru Tomita
    1996 Volume 7 Pages 190-191
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Michiko Muraki, Masaru Tomita
    1996 Volume 7 Pages 192-193
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Yoshimi Toda, Masaru Tomita
    1996 Volume 7 Pages 194-195
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    As described above, Alu subfamily classification, direct repeats, and poly-A tails can be used as markers to refine sequence analysis and to infer the history of duplication events with a high degree of confidence.
  • Y. Fujiwara, M. Asogawa
    1996 Volume 7 Pages 196-197
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper examines a method for normalizing the score of a stochastic motif represented by a hidden Markov model (HMM). The accuracy of the Z score method, which is one of the score normalization methods, is compared with that of the whole search method.
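    As a reminder of what a Z score is in general, the following sketch standardizes a raw score against a set of background scores. How the background scores are obtained for an HMM motif (for example from shuffled or unrelated sequences) is not stated in the abstract, so the `background_scores` argument and the use of the population standard deviation are assumptions.

```python
import statistics


def z_score(raw_score, background_scores):
    """Standardize raw_score against the mean and standard deviation
    of a collection of background scores."""
    mu = statistics.mean(background_scores)
    sigma = statistics.pstdev(background_scores)  # population standard deviation
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (raw_score - mu) / sigma
```

    For background scores `[2, 4, 4, 4, 5, 5, 7, 9]` (mean 5, standard deviation 2), a raw score of 9 gives a Z score of 2.0, meaning it lies two standard deviations above the background mean.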
  • Minoru Asogawa
    1996 Volume 7 Pages 198-199
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • K. Nakata, T. Igarashi, M. Hayakawa, T. Kaminuma
    1996 Volume 7 Pages 200-201
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have developed a database of receptors that gathers data from a variety of genomic and biological information sources on the Internet, including PIR, Swiss Prot, PDB, GenBank, EMBL and GDB. The system provides detailed structural and functional information on receptors, such as ligand binding sites and DNA binding sites taken from the literature, together with three-dimensional structures. The system was implemented on Unix workstations (IRIS, INDIGO 2) using the object-oriented database management system ACEDB (A Caenorhabditis elegans Data Base).
    ACEDB is an object-oriented database management system that has been developed as part of the Caenorhabditis elegans genome research. It is a generalized genome database and can be used to create new databases without the need for any reprogramming or, in fact, any sophisticated computer skills.
    The system provides various viewing tools that effectively display different types of receptor data: DNA sequences, amino acid sequences, DNA binding sites, ligand binding sites, gene and disease information, and protein structural information. It can also display the three-dimensional structure of molecules using the freeware molecular graphics program RASMOL. Detailed information on ligands and signal transduction, taken from the literature, is also included. The system also has a browser interface so that the database can be accessed via the World Wide Web. Information on the sites of action on receptors is of great biological, medical and pharmacological interest. The database may be useful as a quick reference for ligand-membrane receptors and signal transduction in drug design. It may also be used for functional and structural analyses of receptors.
  • Naoko Kasahara, Keiichi Nagai, Susumu Hiraoka
    1996 Volume 7 Pages 202-203
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    We have developed a method, based on dynamic programming, that enables us to directly compare DNA and amino acid sequences. The method makes it possible to find homologies between translated DNA sequences and amino acid sequences by recognizing gaps in both types of sequences. It achieves higher sensitivity and specificity than is possible with BLASTX, which has a similar function. To reduce the computation time, we performed parallel computation on a workstation cluster using PVM (Parallel Virtual Machine) programming.
  • Hikaru Yamamoto, Takuro Tamura, Katsumi Isono, Takashi Gojobori, Hidea ...
    1996 Volume 7 Pages 204-205
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
  • Hiroaki Inayoshi, Hitoshi Iba
    1996 Volume 7 Pages 206-207
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This paper describes a model of the artificial chemical world and its computer simulation, in which rhythms emerge. The model specifies four items of the artificial chemical world: (1) components (five kinds of particles and DNA having Genetic Switches); (2) space (2-dimensional polar grids); (3) simple reaction rules (construction and destruction of molecules, etc.); (4) simple behavioral rules (stochastic movements and stochastic collisions, etc.). The simulation demonstrates the capability of the system to exhibit emergent behavior: that is, global order of the system (regular rhythms in this case) emerges out of randomness (thorough stochastic movements and collisions) of its components.
  • Yutaka Ueno, Kiyoshi Asai
    1996 Volume 7 Pages 208-209
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Studies of interactions between protein molecules sometimes require visualizing huge numbers of atoms in a molecular graphics picture. Namba et al. [1] have reported that simplification and enhancement make molecular pictures informative in their structural study of protein and nucleic acid in the tobacco mosaic virus. Their pictures have boundary outlines to distinguish the different monomers that are symmetrically packed to form the virus, but this cannot be accomplished with currently available molecular graphics software. As novel research in structural biology increases, we will need more functions in graphics software to meet our biological interests. However, most software is hard to modify and cannot be expected to be improved on specific request.
    A new software development project for an extensible protein visualization program for structure analysis and prediction studies has been started to meet this demand. Our goal is to provide a software platform that runs on common hardware and allows users with average programming skill to add new functions. Our first version is a structure viewer for proteins in the PDB database.
    In this project, an application support library was designed together with the target program to give a clear outlook on the complicated programming. Among the many technical issues in building graphics software, the 3D graphics library and memory management functions were redesigned for fast drawing of large numbers of atoms. An original plug-in module function and a graphical user interface toolkit were also designed. The plug-in mechanism was implemented with the dynamic linking system calls of Unix. The program can be configured with the necessary modules from among the viewing and analysis functions that we will eventually develop. A special calculation function using atomic coordinate data can also be added by writing a new plug-in module. In contrast, the macro languages used in some systems can never be as fast or as powerful as the binary code of a plug-in module. The module interface design is now being revised for robustness.
    Prototyping has been completed on Unix with the X Window System. This first version has basic protein visualization features, such as several molecular model representations and rotation, and two new features: 1) boundary outlines to distinguish different molecules; 2) amino acid sequence windows linked to the 3-dimensional viewing window of the protein, so that a selection made in one window is echoed in the other. This gives a convenient tracking view of peptide chains when navigating large proteins. Several examples of protein pictures made by this prototype will be presented in the poster, from a molecular interaction study of muscle proteins: actin (45 kD) and myosin (head subfragment S-1: 120 kD), which are known to interact to generate force. Actin forms a filament in muscle, so several actin monomers should be drawn, with one or two myosin molecules interacting in a picture. This case involves more than 4000 alpha-carbons.
    Our program is written in C with Xlib and standard libraries and will be released for Unix systems. Versions for personal computers are also planned, to take advantage of their high hardware potential.
    Download PDF (153K)
  • Tetsushi Yada, Yasushi Totoki, Masato Ishikawa, Kiyoshi Asai
    1996 Volume 7 Pages 210-211
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (2286K)
  • Makoto Hirosawa, Tetsushi Yada
    1996 Volume 7 Pages 212-213
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (208K)
  • Yasukazu Nakamura, Nobuyuki Miyajima, Makoto Hirosawa, Takakazu Kaneko ...
    1996 Volume 7 Pages 214-215
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (2572K)
  • As a Part of ALIS
    Mika HIRAKAWA, Kensaku IMAI, Akira OHYAMA, Fumihiko KIKUCHI
    1996 Volume 7 Pages 216-217
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    ALIS (Advanced Life science Information Systems) is dedicated to supporting and encouraging large-scale human genome research by creating and distributing databases and providing the computing environment. We report on the primary status of the ALIS project and our WWW service site (http://www-alis.tokyo.jst-c.go.jp). The primary stage of the project has three aspects: large-scale human genome sequencing, construction of an integrated human genome database, and development of supporting functions for the database.
    Download PDF (187K)
  • T. Okazaki, M. Kaizawa, H. Mizushima
    1996 Volume 7 Pages 218-219
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    D. Ghosh of the National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, originally maintained ‘TFD (Transcription Factor Database)’ from 1990. As NCBI stopped maintaining it in 1993, we started a new database, TFDB (Transcription Factor Data Base), to take over some parts of the database, focusing on the DNA binding sequence data. To update the database with recent data, we developed a system that searches a literature database exhaustively and extracts related information from the abstracts of the collected articles. We also developed a mail server for searching the target sequences of transcription factors using this database.
    Download PDF (161K)
  • F. Lisacek, N. El Mabrouk
    1996 Volume 7 Pages 220-221
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (191K)
  • Aki HASEGAWA, Yasuo UEMURA, Satoshi KOBAYASHI, Takashi YOKOMORI
    1996 Volume 7 Pages 222-223
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    It is of great significance to develop an efficient software system for higher-level structural prediction in RNA/protein sequences. For RNA secondary structure prediction in particular, a prediction system must be able to deal with so-called “pseudoknot” structures, one of the most typical and important constructs found in vivo, yet no effective system has been reported for predicting RNA secondary structures involving pseudoknots.
    We are developing prediction systems for RNA secondary structures that can handle pseudoknots in an elegant manner. The systems under development are constructed based on the following two approaches.
    Download PDF (170K)
  • A Solvent Effect Model Based on the Evaluation of Solvent-Accessible Surface Area and Generalized Born Equation
    Valentin Gogonea, Camelia Baleanu-Gogonea, Carlos A. Del Carpio
    1996 Volume 7 Pages 224-225
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    This is the second of a series of articles describing our system for prediction of protein conformation in solution. Here we propose a force field for studying protein folding in solution. Our force field is made up of an internal force field (MM2) and a solvent force field which sums up the constraints that the solvent imposes on protein structure in solution, as compared with the gas phase.
    Download PDF (219K)
  • Structural Motifs Encoded in the DNA?
    Carlos A. Del Carpio, Valentin Gogonea, Katsuhisa Yamaguchi, Makoto Ta ...
    1996 Volume 7 Pages 226-227
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Experimental evidence implying that complementary DNA strands encode amino acids which exhibit complementary hydrophobic characteristics has led us to inspect sense-antisense homology in several hundred proteins recorded in the PDB. We present here partial results of this analysis, which relate localized peculiar structural characteristics of proteins to the sense-antisense homology boxes found in the primary sequences. A further analysis is performed to determine whether these sense-antisense homology boxes, if present within the protein, are encoded by unique sequences of codons in the DNA. We also give a progress report on the methodology and the results obtained so far.
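    To make the sense-antisense relationship concrete, the sketch below pairs each amino acid of a coding sequence with the amino acid encoded by the reverse complement of its codon on the opposite strand. It only illustrates the underlying codon relationship and is not the authors' homology-box analysis; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def sense_antisense_pairs(cds):
    """Pair each sense residue with the residue its antisense codon encodes.

    Unknown or ambiguous codons are reported as 'X'.
    """
    cds = cds.upper()
    pairs = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        sense = CODON_TABLE.get(codon, "X")
        antisense = CODON_TABLE.get(reverse_complement(codon), "X")
        pairs.append((sense, antisense))
    return pairs


if __name__ == "__main__":
    print(sense_antisense_pairs("ATGGCC"))
```

    Each pair could then be compared on a hydropathy scale of one's choice to examine the complementary hydrophobic behaviour that the abstract refers to.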
    Download PDF (216K)
  • A. Ogiwara, N. Ogasawara, M. Watanabe, T. Takagi
    1996 Volume 7 Pages 228-229
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (2581K)
  • Sivasundaram Suharnan, Takeshi Itoh, Hidemi Watanabe, Jun-ichi Takeda, ...
    1996 Volume 7 Pages 230-231
    Published: 1996
    Released on J-STAGE: July 11, 2011
    JOURNAL FREE ACCESS
    Download PDF (181K)