-
Louxin Zhang
1996 Volume 7 Pages
1-12
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
A conjecture of Mirkin, Muchnik and Smith is answered affirmatively which connects the inconsistency function, a biologically meaningful dissimilarity measure for a gene and species tree, to the mutation cost function, a combinatorial measure based on mapping of trees. A linear-time algorithm for computing the inconsistency function is also derived from the conjecture.
-
sorting signed permutations by reversals and transpositions
Qian-Ping Gu, Shietung Peng, Hal Sudborough
1996 Volume 7 Pages
13-22
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Recently, a new approach to analyzing genome evolution was proposed that is based on comparison of gene orders rather than the traditional comparison of DNA sequences (Sankoff et al., 1992). The approach is based on global rearrangements (e.g., inversions and transpositions of fragments). Analysis of genomes evolving by inversions and transpositions leads to the combinatorial problem of sorting by reversals and transpositions, i.e., sorting a permutation using reversals and transpositions of arbitrary fragments. The problem is conjectured to be NP-hard. We study sorting of signed permutations by reversals and transpositions, a problem which adequately models genome rearrangements, as the genes in DNA are oriented. We establish a lower bound and give two algorithms for the problem. Based on the lower bound, we show that the first algorithm is a 2-approximation algorithm. The time complexity of this algorithm may not be bounded by Poly(n), where n is the length of the permutation to be sorted. Setting a time limit on the first algorithm, we get the second algorithm, which is a 2(1+1/k)-approximation algorithm, where k ≥ 3 is any fixed integer, and runs in Poly(n) time.
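The two rearrangement operations named in this abstract can be illustrated with a minimal Python sketch. It only shows what a reversal and a transposition do to a signed permutation; it is not the approximation algorithm of the paper, and the function names and the example permutations are hypothetical.

```python
def reversal(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip the sign of each element."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def transposition(perm, i, j, k):
    """Exchange the adjacent segments perm[i:j] and perm[j:k]; signs are unchanged."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def is_sorted(perm):
    """A signed permutation is sorted when it equals +1, +2, ..., +n."""
    return perm == list(range(1, len(perm) + 1))


# Hypothetical example: sort [3, -2, 1] with three reversals.
p = [3, -2, 1]
p = reversal(p, 0, 2)   # whole permutation reversed, all signs flipped
p = reversal(p, 0, 0)   # flip the sign of the first element
p = reversal(p, 2, 2)   # flip the sign of the last element
print(p, is_sorted(p))

# Hypothetical example: a single transposition sorts [2, 3, 1].
print(transposition([2, 3, 1], 0, 2, 3))
```

Because genes are oriented, a reversal changes both the order and the signs of the affected elements, whereas a transposition only moves a block.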
-
H. Matsuda, T. Ishihara, A. Hashimoto
1996 Volume 7 Pages
23-32
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper presents a method for clustering a large and mixed set of uncharacterized sequences provided by genome projects. As the measure for the clustering, we use a fast approximation of sequence similarity (FASTA score). However, when detecting similarity between two sequences that have diverged considerably during evolution, FASTA sometimes underestimates the similarity compared to the rigorous Smith-Waterman algorithm. Also, the distance derived from the similarity score may not be a metric, since the triangle inequality may not hold when the sequences have a multi-domain structure. To cope with these problems, we introduce a new graph structure called the p-quasi complete graph for describing a cluster of sequences with a confidence measure. We prove that a restricted version of the p-quasi complete graph problem (given a positive integer k, does a graph contain a 0.5-quasi complete subgraph of size ≥ k?) is NP-complete. Thus we present the outline of an approximation algorithm for clustering a set of sequences into subsets corresponding to p-quasi complete graphs. The effectiveness of our method is demonstrated by the result of clustering Escherichia coli protein sequences.
-
Fei Shi, Peter Widmayer
1996 Volume 7 Pages
33-40
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We are given a finite set S of text strings and a pattern P over some fixed alphabet Σ. The topic of this paper is the design of a data structure D(S) which supports approximate multiple string searching queries efficiently. Thereby, for a given upper bound k ∈ Z^+ on the allowable distance, P = p_1...p_m is said to appear approximately in a text T = t_1...t_n, m, n ∈ Z^+, if there exist positions u, v in T such that the edit distance between P and t_u...t_v is at most k. Let N denote the sum of the lengths of all strings in S. We present an algorithm that constructs the data structure D(S) in O(N) time and space. Afterwards, an approximate multiple string search query can be answered in O(N) expected time if the allowable distance k is bounded above by O(m/log m). The method can be used to search large nucleotide and amino acid sequence databases for similar sequences.
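The definition of an approximate occurrence given in this abstract can be checked directly with the classical dynamic programming scheme for approximate string matching. The Python sketch below is only an illustration of that definition: it runs in O(mn) time on a single text and is not the data structure D(S) of the paper. The function name and the example strings are hypothetical.

```python
def approx_occurrences(pattern, text, k):
    """Return the 1-based end positions v such that some substring
    t_u...t_v of text has edit distance at most k from pattern."""
    m = len(pattern)
    # prev[i] = minimum edit distance between pattern[:i] and any
    # substring of the text ending at the previous text position.
    prev = list(range(m + 1))
    ends = []
    for v, ch in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: an occurrence may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra text character
                         cur[i - 1] + 1)      # missing pattern character
        if cur[m] <= k:
            ends.append(v)
        prev = cur
    return ends


print(approx_occurrences("ACGT", "TTACGTTT", 0))
print(approx_occurrences("ACGT", "TTACCTTT", 1))
```

Setting the first row of the table to zero is what allows the occurrence to begin at any position u of the text.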
-
Tetsuo Shibuya, Hiroshi Imai
1996 Volume 7 Pages
41-50
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
The alignment problem for DNA or protein sequences is widely applicable and important in various fields of molecular biology. In this problem, the optimal solution obtained with fixed parameters (gap penalties, weights for weighted multiple alignment problems, and so on) is not always the biologically best alignment. Thus, it is necessary to vary the parameters and check the varying optimal alignments. Ways of varying the parameters have been studied well for the problem of only two sequences [6, 7, 12, 13, 14, 15], but not for the multiple alignment problem, because of the difficulty of computing the optimal solution. This paper presents techniques for the parametric multiple alignment problem, and examines the features of the alignments obtained by parametric analysis on gap penalty and weight matrix through experiments. These experiments reveal the importance of adopting appropriate parameter values to obtain meaningful multiple alignments.
-
Michiyo Yamaguchi, Shinichi Shimozono, Takeshi Shinohara
1996 Volume 7 Pages
51-60
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We propose a learning algorithm that discovers a motif represented by patterns and an alphabet indexing from biosequences. From only positive examples, with the help of an alphabet indexing, the algorithm finds k regular patterns as a k-minimal multiple generalization (k-mmg for short). The computational results for transmembrane domains indicate that the combination of k-mmg and alphabet indexing works quite successfully. We also introduce a partial alphabet indexing that transforms symbols depending on their position in sequences.
-
Comparison of the Diversity Among Bacteria and Prediction of the Protein Production Levels in Cells
Shigehiko Kanaya, Yoshihiro Kudo, Shinya Suzuki, Toshimichi Ikemura
1996 Volume 7 Pages
61-71
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
In the present study, we have developed a procedure for estimating species-specific heterogeneous codon usage among intraspecific genes, called diversity in codon usage, and for systematizing species by the species-specific diversity on the basis of principal component analysis. We tried to quantify differences in the diversity among five species, Escherichia coli (Ec), Salmonella typhimurium (St), Haemophilus influenzae (Hi), Bacillus subtilis (Bs), and Synechocystis sp. (Ss). In the five species, many of the genes involved in the translation process and energy metabolism had positive values (Z_1 > 0) on the first principal component (PC1). In Ss, many of the genes involved in the photosynthetic system also had positive Z_1-values. These genes are thought to be highly expressed. By the direction of PC1, the five species were roughly classified into three categories, [Ec, St, Hi], [Ss], [Bs]. The dendrogram constructed was roughly consistent with the rRNA-based phylogeny, but interesting differences were also observed between the two phylogenetic trees.
-
Kenta Nakai
1996 Volume 7 Pages
72-81
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Since signal peptides play a crucial role for specifying the in-vivo fate of proteins, prediction of their existence is important for the characterization of ORFs of unknown function. To make such predictions as reliable as possible, the features of signal peptides of two important model organisms, Saccharomyces cerevisiae and Bacillus subtilis, were examined and the accuracy of current prediction methods was refined using these data. Direct optimization of the threshold values of existing methods significantly raised the predictability but the variables that were most effective for improvement were different in these two organisms. In yeast, the maximum hydrophobicity value of an 8-residue segment mainly contributed to raising the predictability to 98.5% when estimated by the cross validation procedure. In Bacillus species, the length of uncharged segment and the charges in the N-terminal region (net charge and negative charge) were combined to give a prediction accuracy of 98.2% although the data size was relatively small in this case.
-
M. S. Gelfand, T. V. Astakhova, M. A. Roytberg
1996 Volume 7 Pages
82-87
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Since absolutely reliable recognition of protein-coding regions in eukaryote genomic DNA sequences by computational methods is unattainable, most existing algorithms try to keep some balance between underprediction and overprediction. However, in experimental practice it is often sufficient to have just a few protein-coding segments, but predicted with high specificity, that is, with (almost) no overprediction. Such predictions are then used for construction of oligonucleotide probes and PCR primers for analysis of cDNA libraries or total cellular RNA.
Here we present a combinatorial algorithm solving this problem. Unlike other prediction schemes, the algorithm uses only the simplest statistical parameters (codon usage and positional nucleotide sequences in splicing sites) and thus can be used for analysis of obscure genomes, when large learning sets are unavailable. The algorithm's structure allows one to simply tune it for various experimental settings.
-
Kiyoshi Asai, Tetsushi Yada, Katunobu Itou
1996 Volume 7 Pages
88-97
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
A new method for combining a protein motif dictionary with a gene finding system is proposed. The system consists of Hidden Markov Models (HMMs) and a dictionary. The HMMs represent the nucleic acid bases, the codons, and the amino acids. The ‘words’ in the dictionary are described by sequences of these HMMs and represent the noncoding regions, the codons, protein motifs, tRNA regions and signals in DNA sequences. The statistics between these regions are expressed by the “grammar”, which is a stochastic network of the ‘words’.
Using the same kind of technique as speech recognition by HMMs with a word dictionary and a grammar, the stochastic network of ‘words’ enables the motif dictionary to be used during the parsing of the DNA sequences. At the same time, the information of the di-codon statistics, which are known to be important parameters, is included in the stochastic network. As a result, while the system parses DNA sequences and finds the coding regions, the protein motifs are automatically annotated in the regions. This helps to identify the functions of the genes and reduces the cost of homology search for each hypothetical coding region. This method is different from simply using the information of a homology search. It uses the information of the motif patterns during the parsing process, whereas searching for the motif patterns after or before finding the coding regions cannot directly affect the parsing process itself. Experimental results have shown that this method correctly finds and annotates the motifs in the coding regions in the DNA sequence of a cyanobacterium.
-
Naohiro Furukawa, Satoshi Matsumoto, Ayumi Shinohara, Takayoshi Shouda ...
1996 Volume 7 Pages
98-107
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We developed a machine learning system, HAKKE, which is suitable for predicting functional regions from sequences, such as protein-coding region prediction and transmembrane domain prediction. HAKKE is a hybrid system in which a pool of algorithms cooperates to make an accurate prediction. The system uses an extension of the weighted majority algorithm in order to fit the strength of each algorithm to the given training examples. In this paper, we describe the core of the system and show some experimental results on transmembrane domain and α-helix predictions.
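For orientation, the basic weighted majority algorithm (the textbook version, not the extension used in HAKKE, which the abstract does not specify) can be sketched in Python as follows. The function name, the penalty factor `beta`, and the toy data are assumptions made only for illustration.

```python
def weighted_majority(predictions, labels, beta=0.5):
    """Basic weighted majority voting over a pool of binary predictors.

    predictions: one list per round, holding a 0/1 prediction per algorithm.
    labels: the true 0/1 label of each round.
    Returns (final weights, number of mistakes made by the combined vote).
    """
    n = len(predictions[0])
    weights = [1.0] * n          # every algorithm starts with equal strength
    mistakes = 0
    for preds, y in zip(predictions, labels):
        vote1 = sum(w for w, p in zip(weights, preds) if p == 1)
        vote0 = sum(w for w, p in zip(weights, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        # Multiply the weight of every algorithm that was wrong by beta.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return weights, mistakes


# Hypothetical pool of three algorithms observed over three rounds.
rounds = [[1, 0, 0], [1, 0, 1], [0, 1, 1]]
truth = [1, 1, 0]
print(weighted_majority(rounds, truth))
```

Algorithms that err often lose weight quickly, so the combined vote comes to follow the algorithms that fit the training examples best.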
-
Carlos A. Del Carpio, Valentin Gogonea
1996 Volume 7 Pages
108-118
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This article is the first of a series of papers describing the development of an automatic system for prediction of the three-dimensional conformation of proteins in solution. In this first part we discuss the implementation of the protein conformational space mapping engine. This is a procedure based on a robust parallel genetic algorithm (GA) which runs on a network of transputers. We describe aspects of the algorithm related to the major factors that influence the protein folding process and describe their implementation within the scheme of the evolutionary algorithm. Among them, we make a thorough review of the co-operativity of emergent partial secondary structures as the evolutionary process proceeds, and of its effects on the stability of newly generated conformers as well as on the performance of the GA. We then address the hydrogen bond, synthesize the demographic trends in known proteins suggested by Stickle et al., and implement them as an index for assessing the goodness of the generations of protein conformers. Finally, we make an intensive analysis of the packing of the amino acid side chains and show how a hybrid algorithm can bring about a relaxation of the perturbations caused by the operations of the GA, and a genuine improvement of the overall process. In the second paper of this series we propose guidelines under which we implement the solvent effect, which in concert with the above-mentioned factors results in a system for protein 3D structure prediction in solution.
-
Nickolai N. Alexandrov, Victor V. Solovyev
1996 Volume 7 Pages
119-127
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Hydrophobic long-range interactions and local polypeptide chain propensities are the major factors directing protein folding. Incorporating both these terms in addition to the Dayhoff matrix helps us to increase the quality of protein fold recognition via sequence-structure alignment. We have shown that the results of secondary structure prediction substantially increase the sensitivity of the fold recognition. To measure the performance of protein fold recognition, we have developed a comprehensive test along with a set of quality control scores based on the most populated structural families. With this test we have demonstrated improvement of the sequence alignment when the predicted secondary structure is considered, even without knowledge of the real three-dimensional structure.
-
Hiroyuki Ogata, Wataru Fujibuchi, Hidemasa Bono, Susumu Goto, Minoru K ...
1996 Volume 7 Pages
128-136
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
In conjunction with a new database system that efficiently organizes the metabolic pathway data from various organisms, we are developing computational methodologies using binary relations and hierarchies of enzymes. Biological knowledge integrated in the system includes genes, gene products, chemical compounds, enzyme reactions and metabolic pathway diagrams. By automatically mapping the enzymes of a specific organism onto the pathway diagrams, it becomes possible to visualize the characteristic features of the organism-specific metabolic pathways. With the aid of the computational methodology implemented in the system, it also becomes possible to analyze and investigate the pathways in terms of their function and evolution. In this paper, we describe the outline of the system and present new biological features of metabolic pathways revealed by the system.
-
Yasuhiko Kitamura, Tetsuya Nozaki, Hideyuki Nakanishi, Teruhisa Miura, ...
1996 Volume 7 Pages
137-146
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
With the advance of the Human Genome Project, a huge amount of varied genome data has been stored in a number of databases, and the WWW system is widely used to access these databases. From the viewpoint of the information supplier, the WWW is quite a useful tool for providing various types of data easily, but from the viewpoint of the information consumer, it is not good enough because of the lack of a rigid data format and the difficulty of data access. In this paper, by extending a current WWW browser, we propose two generic WWW tools, MetaViewer and MetaCommander, try to apply them to genome informatics to support researchers who search, analyze, and dispatch genome data, and discuss their potential advantages from the viewpoint of the information consumer.
-
R. Gras, J. Nicolas
1996 Volume 7 Pages
147-156
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We present a new tool, FOREST, which aims to represent the content of a large nucleic acid sequence (e.g., > 100 KB) in a form suitable for the biologist. More precisely, FOREST builds all subsequences repeated in a sequence or a set of sequences. It not only allows one to look for the locations of the various occurrences of a given subsequence but also points to interesting subsequences with respect to a given criterion. This tool is based on two key ideas. The first idea consists of building a suffix-tree representation of a sequence and associating with each node of this tree a set of synthesized attributes, computed on the set of subsequences under this node. This allows the biologist to “browse” in the sequence with a constant abstract view of what he may expect to find in the section of the tree he is currently investigating. The second idea consists of summarizing the distribution of the information with boolean vectors associated with the sequence. These vectors may be easily displayed in the form of a linear map of events, as is done in genetic mapping. Both representations allow various efficient operations on the sequence. They provide a powerful capacity for filtering the data, while reducing the set of elementary filtering operations to a minimum of conceptual operations. This allows the biologist to easily investigate the most prominent features of the lexical structure of his sequences.
-
T. Koike, T. Okayama, J. Ishii, T. Mizunuma, T. Tamura, Y. Tateno, H. ...
1996 Volume 7 Pages
157-165
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
As molecular biology has made rapid progress in recent years, a great number of changes have been required in the methodology for maintaining and utilizing DNA sequence data. For example, annotation of sequences has become complex and extensive. DDBJ, recognizing the impending requirements, decided to develop a new DNA sequence database system in 1995. To tolerate frequent changes of the data structures and a significant increase of the data in terms of quality and quantity, we designed a completely new database schema. In the new system, physical changes of the data structure do not affect applications such as annotation tools. We also designed a new annotation tool based on an object-oriented concept that allows us to handle DNA sequence data in computers as intuitively as in the real world. The annotation tool is named YAMATO II. The new system also takes care of the needs of DDBJ itself. Data traffic and security in database access were especially analyzed, and reviewers of data for DDBJ who are distant from DDBJ are now able to process the data safely and comfortably in the new system. The new system also achieves more robust and effective data exchange with our partners in the international nucleotide sequence databanks, EMBL and GenBank.
-
Hajime Kitakami, Yasuma Mori, Masatoshi Arikawa
1996 Volume 7 Pages
166-167
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We developed a taxonomy database system for managing multimedia contents. The system is accessible to remote users through the World-Wide Web and is implemented with SQL programming and CGI (Common Gateway Interface) scripts for the World-Wide Web.
-
Hiroaki Kato, Yoshimasa Takahashi
1996 Volume 7 Pages
168-169
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper describes an approach to automated identification of three-dimensional (3-D) motifs in proteins. Here, the structure of a protein is reduced to an abstract representation consisting of the α-helix and β-strand secondary structure elements, these being described by vectors in 3-D space rather than by the point-like atoms that are used in the simple Cα approximation. The algorithms and their implementation are discussed with a couple of execution examples of the identification of 3-D motif candidates using well-known motifs.
-
Takayuki Kamei, Yasuo Yonezawa
1996 Volume 7 Pages
170-171
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
It is well known that enzymes are very important as reaction factors in the activity of living systems. However, their properties have not yet been sufficiently studied from the standpoint of information theory. We therefore examined the correlation between the complexity of the amino acid sequences of enzymes and their function using an informational measure, in order to elucidate the informational properties of sequence structure. In addition, power spectra of enzyme sequence complexity, obtained by the Fourier transform (FT) method, showed a specific profile. As a result, enzyme protein sequences were found to have greater complexity than non-enzyme protein sequences. Moreover, the FT profiles showed a typical pattern for the complexity of enzyme protein sequences. These results suggest a new viewpoint for protein analysis based on information science.
-
Y. Wada, H. Yasue
1996 Volume 7 Pages
172-173
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have developed a new version of the Animal Genome Database for network users using Java applets. The new version includes a linkage homology map, a Java version of the clickable linkage map, and a Japanese tutorial with audio clips. Furthermore, we have started a Mouse Genome Informatics mirror site in Japan.
-
A. Nakaya, A. Yonezawa, K. Yamamoto
1996 Volume 7 Pages
174-175
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Masamichi Isokawa, Masato Wayama, Toshio Shimizu
1996 Volume 7 Pages
176-177
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Takashi Ishikawa, Shigeki Mitaku, Takao Terano, Makiko Suwa, Takatsugu ...
1996 Volume 7 Pages
178-179
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This research develops a method for discovering functional sites of amino acid sequences using an Inductive Logic Programming (ILP) method with sorted variable generalization. Functional sites provide clues to building a knowledge base for prediction of protein functions from amino acid sequences. The proposed method generates hypotheses of functional sites directly from aligned amino acid sequences using an ILP method extended with sorted variable generalization. The proposed method is shown to be useful for discovering functional sites by an example application to the case of bacteriorhodopsin-like proteins.
-
Takatsugu Hirokawa, Boon-Chieng Seah, Shigeki Mitaku
1996 Volume 7 Pages
180-181
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
A new method for predicting transmembrane helices from amino acid sequences was developed, in which the effect of the stabilization of helices by interhelix binding was taken into account. It was assumed that there are three stages in the formation of a transmembrane helix conformation: binding to the membrane surface, formation of the transmembrane core region, and maturation of the helix due to tertiary structure formation in the membrane. This method was applied to the amino acid sequences of membrane proteins for which the number of transmembrane helices is known, and most transmembrane helices were correctly predicted.
-
Takanori Washio, Masahiko Wada, Masaru Tomita
1996 Volume 7 Pages
182-183
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Yasuhiro Asakawa, Masaru Tomita
1996 Volume 7 Pages
184-185
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
An interesting correlation between G+C content and the lengths of primate introns has been found by our computer analysis.
All sequences of primate introns were extracted from the GenBank database and classified into subgroups according to their lengths (the number of bases; increment of 100). G+C contents (%) were then calculated for each subgroup.
The results indicate that shorter introns tend to contain more G and C nucleotides, and longer introns more A and T nucleotides.
Frequencies of each nucleotide for each subgroup are shown in figure 1.
We also computed the G+C contents of the exons flanking those introns for each subgroup. As we can see in figure 2, similar but weaker tendencies are observed.
The biological significance of these observations is currently under investigation. We also intend to extend our analysis to other eukaryotes.
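The grouping and calculation described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact bin boundaries used by the authors are not given, so the sketch assumes bins of the form [0, 100), [100, 200), and so on, and the function name and toy sequences are hypothetical.

```python
from collections import defaultdict


def gc_content_by_length(introns, bin_size=100):
    """Group sequences by length (assumed bins of width bin_size) and
    return {bin start: G+C content in percent} for each non-empty bin."""
    counts = defaultdict(lambda: [0, 0])  # bin start -> [G+C bases, all bases]
    for seq in introns:
        seq = seq.upper()
        if not seq:
            continue  # skip empty records
        bin_start = (len(seq) // bin_size) * bin_size
        counts[bin_start][0] += seq.count("G") + seq.count("C")
        counts[bin_start][1] += len(seq)
    return {b: 100.0 * gc / total for b, (gc, total) in sorted(counts.items())}


# Hypothetical toy input: two short sequences and one of length 150.
print(gc_content_by_length(["GGCC", "ATAT", "G" * 150]))
```

Pooling all bases within a bin, as done here, weights longer introns more heavily than averaging per-intron percentages would; which of the two the authors used is not stated.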
-
Tom Shimizu, Kouichi Takahashi, Masaru Tomita
1996 Volume 7 Pages
186-187
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Rintaro Saito, Hidekazu Sasaki, Yuko Osada, Masaru Tomita
1996 Volume 7 Pages
188-189
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Yuko Osada, Ryo Matsushima, Masaru Tomita
1996 Volume 7 Pages
190-191
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Michiko Muraki, Masaru Tomita
1996 Volume 7 Pages
192-193
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Yoshimi Toda, Masaru Tomita
1996 Volume 7 Pages
194-195
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
As described above, Alu subfamily classification, direct repeats, and poly-A tails can be used as markers to refine sequence analysis and to infer the history of duplication events with a high degree of confidence.
-
Y. Fujiwara, M. Asogawa
1996 Volume 7 Pages
196-197
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper examines a method to normalize the score of a stochastic motif represented by a hidden Markov model (HMM). The accuracy of the Z score method, which is one of the score normalization methods, is compared with that of the whole search method.
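As a reminder of what a Z score is, the generic normalization can be sketched in Python as below. How the paper obtains the background score distribution is not stated in the abstract; the sketch simply assumes that a list of background scores (for example, scores of unrelated sequences) is available, and all names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def z_score(raw_score, background_scores):
    """Express raw_score in units of standard deviations above the
    mean of an assumed background score distribution."""
    mu = mean(background_scores)
    sigma = pstdev(background_scores)  # population standard deviation
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (raw_score - mu) / sigma


# Hypothetical numbers: a motif score of 14 against background scores 8 and 12.
print(z_score(14, [8, 12]))
```

A normalized score of this kind lets scores of models with different lengths or score ranges be compared on a common scale.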
-
Minoru Asogawa
1996 Volume 7 Pages
198-199
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
K. Nakata, T. Igarashi, M. Hayakawa, T. Kaminuma
1996 Volume 7 Pages
200-201
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have developed a database of receptors that gathers data from information sources on the Internet. The sources of this database are a variety of genomic and biological information resources on the Internet: PIR, Swiss Prot, PDB, GenBank, EMBL, GDB, etc. The system provides detailed structural and functional information on receptors, such as ligand binding sites and DNA binding sites, which were picked up from the references, and the three-dimensional structures. The system was implemented on a Unix workstation (IRIS, INDIGO 2), using the object-oriented database management system ACEDB (A Caenorhabditis elegans Data Base).
ACEDB is an object-oriented database management system that has been developed as part of the Caenorhabditis elegans genome research. It is a generalized genome database, and can be used to create new databases without the need for any reprogramming or, in fact, any sophisticated computer skills.
The system provides various viewing tools that effectively display different types of receptor data: DNA sequences, amino acid sequences, DNA binding sites, ligand binding sites, gene and disease information, and protein structural information. It can also display the three-dimensional structure of molecules using the freeware molecular graphics program RASMOL. Detailed information on ligands and signal transduction, picked up from references, is also included. The system also has a browser interface so that the database can be accessed via the World Wide Web. Information on the sites of action on the receptor is of great biological, medical and pharmacological interest. The database may be useful as a quick reference for ligand-membrane receptors and signal transduction in drug design. We may also use the database for functional and structural analyses of receptors.
-
Naoko Kasahara, Keiichi Nagai, Susumu Hiraoka
1996 Volume 7 Pages
202-203
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
We have developed a method, based on dynamic programming, that enables us to directly compare DNA and amino acid sequences. This method makes it possible to find homologies between translated DNA sequences and amino acid sequences by recognizing gaps in both types of sequences. It achieves higher sensitivity and specificity than is possible with BLASTX, which has a similar function. To reduce the computation time, we performed a parallel computation on a workstation cluster using PVM (Parallel Virtual Machine) programming.
-
Hikaru Yamamoto, Takuro Tamura, Katsumi Isono, Takashi Gojobori, Hidea ...
1996 Volume 7 Pages
204-205
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Hiroaki Inayoshi, Hitoshi Iba
1996 Volume 7 Pages
206-207
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This paper describes a model of the artificial chemical world and its computer simulation, in which rhythms emerge. The model specifies four items of the artificial chemical world: (1) components (five kinds of particles and DNA having Genetic Switches); (2) space (2-dimensional polar grids); (3) simple reaction rules (construction and destruction of molecules, etc.); (4) simple behavioral rules (stochastic movements and stochastic collisions, etc.). The simulation demonstrates the capability of the system to exhibit emergent behavior: that is, global order of the system (regular rhythms in this case) emerges out of randomness (thorough stochastic movements and collisions) of its components.
-
Yutaka Ueno, Kiyoshi Asai
1996 Volume 7 Pages
208-209
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Studies of interaction between protein molecules sometimes require visualizing huge numbers of atoms in a molecular graphics pictures. Namba et al.[1] has reported that simplification and enhancement makes molecular pictures informative in their structural study of protein and nucleic acid in the tabaco mosaic virus. Their pictures have boundary outlines to distinguish different monomers which are symmetrically packed to form the virus, but it is not accomplished by currently available molecular graphics softwares. As novel research in structure biology increased, we will need more functions for graphics software to meet our biological interest. However most software are hard to modify and not expected to be improved on a specific request.
A new software development project of an extensible protein visualization program for structure analysis and prediction study has started for this demand. Our goal is to provide a software platform which runs on common hardware and allow users to add new functions with average programming skill. Our first version is a structure viewer program of proteins in PDB database.
In this project, an application supporting library was designed together with a target program to lead clear prospect of the complicated programming. Among number of technical issues for building a graphics software, 3d-graphics library and memory management functions are redesigned for fast drawing of large number of atoms. An original plug-in module function and a graphical user interface tool kit is also designed. This plug-in module was implemented by dynamic linking system calls in Unix system. The program can be configured with necessary modules from numbers of viewing and analysis functions for the software which we will develop eventually. Also a special calculation function using atomic coordinate data can be added by writing a new plug-in module. In contrast, macro language has been used in some systems, it never be faster and powerful than a binary code of plug-in module. A robust module interface design is now revised.
Prototyping has been completed on Unix with the X Window System. This first version has basic protein visualization features, such as several molecular model representations and rotation, and two new features: 1) boundary outlines to distinguish different molecules; 2) amino acid sequence windows linked to the three-dimensional viewing window of the protein, so that a selection made in one window is echoed in the other. This gives a convenient tracking view of peptide chains when navigating large proteins. Several examples of protein pictures made with this prototype will be presented in the poster, taken from a molecular interaction study of the muscle proteins actin (45 kD) and myosin (head subfragment S-1: 120 kD), which are known to interact to generate force. Actin forms a filament in muscle, so several actin monomers should be drawn, with one or two myosin molecules interacting with them in a picture. This case involves more than 4000 alpha-carbons.
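To make the alpha-carbon count concrete, the following sketch extracts C-alpha atoms from PDB-format ATOM records and counts them per chain. It assumes the standard fixed-column PDB layout; the sample records are made-up illustrative data, and the helper names `alpha_carbons` and `_atom` are hypothetical, not part of the program described above.

```python
from collections import Counter


def alpha_carbons(pdb_text):
    """Yield (chain_id, residue_name, residue_number, (x, y, z)) per C-alpha."""
    for line in pdb_text.splitlines():
        # Fixed-column ATOM records: the atom name occupies columns 13-16.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line[21], line[17:20], int(line[22:26]), xyz


def _atom(serial, name, res, chain, resseq, x, y, z):
    """Format one ATOM record (illustrative data, not from a real entry)."""
    return (f"ATOM  {serial:5d} {name:4s} {res:3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


sample = "\n".join([
    _atom(1, " N  ", "MET", "A", 1, 0.0, 0.0, 0.0),
    _atom(2, " CA ", "MET", "A", 1, 1.458, 0.0, 0.0),
    _atom(3, " CA ", "LYS", "A", 2, 3.8, 1.2, 0.5),
    _atom(4, " CA ", "GLY", "B", 1, 9.1, -2.0, 4.4),
])

# Count alpha-carbons per chain, e.g. to estimate how much must be drawn.
per_chain = Counter(chain for chain, _, _, _ in alpha_carbons(sample))
print(dict(per_chain))
```

Restricting the drawing to one atom per residue in this way is a common means of keeping a multi-molecule picture tractable.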
Our program is written in C with Xlib and standard libraries and will be released for Unix systems. Versions for personal computers are also planned, to take advantage of their high hardware potential.
-
Tetsushi Yada, Yasushi Totoki, Masato Ishikawa, Kiyoshi Asai
1996 Volume 7 Pages
210-211
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Makoto Hirosawa, Tetsushi Yada
1996 Volume 7 Pages
212-213
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Yasukazu Nakamura, Nobuyuki Miyajima, Makoto Hirosawa, Takakazu Kaneko ...
1996 Volume 7 Pages
214-215
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
As a Part of ALIS
Mika HIRAKAWA, Kensaku IMAI, Akira OHYAMA, Fumihiko KIKUCHI
1996 Volume 7 Pages
216-217
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
ALIS (Advanced Life science Information Systems) is dedicated to supporting and encouraging large-scale human genome research by creating and distributing databases and providing a computing environment. We report on the initial status of the ALIS project and our WWW service site (http://www-alis.tokyo.jst-c.go.jp). The initial stage of the project has three aspects: large-scale human genome sequencing, construction of an integrated human genome database, and development of supporting functions for the database.
-
T. Okazaki, M. Kaizawa, H. Mizushima
1996 Volume 7 Pages
218-219
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
D. Ghosh of the National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, originally maintained TFD (Transcription Factor Database) from 1990. Because NCBI stopped maintaining it in 1993, we started a new database, TFDB (Transcription Factor Data Base), to take over parts of that database, focusing on DNA binding sequence data. To update the database with recent data, we developed a system that searches a literature database exhaustively and extracts related information from the abstracts of the collected articles. We also developed a mail server that searches for the target sequences of transcription factors using this database.
-
F. Lisacek, N. El Mabrouk
1996 Volume 7 Pages
220-221
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Aki HASEGAWA, Yasuo UEMURA, Satoshi KOBAYASHI, Takashi YOKOMORI
1996 Volume 7 Pages
222-223
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
It is of great significance to develop an efficient software system for higher-level structure prediction in RNA and protein sequences. For RNA secondary structure prediction, a prediction system must be able to deal with so-called “pseudoknot” structures, one of the most typical and important constructs found in vivo, yet no effective system has been reported for predicting RNA secondary structures that involve pseudoknots.
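For orientation, a pseudoknot can be characterized combinatorially: two base pairs (i, j) and (k, l) cross when i < k < j < l, whereas pseudoknot-free secondary structures contain only nested or disjoint pairs. The sketch below checks a list of base pairs for such a crossing. It illustrates the definition only and is not the prediction method of the authors; the function name `has_pseudoknot` is hypothetical.

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs (i, j) and (k, l) cross: i < k < j < l."""
    # Normalize each pair to (smaller, larger) and sort by the opening position.
    ordered = sorted((min(p), max(p)) for p in pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


hairpin = [(0, 9), (1, 8), (2, 7)]            # nested pairs only
h_type = [(0, 10), (1, 9), (5, 15), (6, 14)]  # second stem crosses the first

print(has_pseudoknot(hairpin), has_pseudoknot(h_type))
```

The quadratic scan is enough for illustration; the crossing condition is what prevents standard nested-structure dynamic programming from covering pseudoknots directly.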
We are developing prediction systems for RNA secondary structures that can handle pseudoknots in an elegant manner. The systems under development are constructed in the following two ways.
-
A Solvent Effect Model Based on the Evaluation of Solvent-Accessible Surface Area and Generalized Born Equation
Valentin Gogonea, Camelia Baleanu-Gogonea, Carlos A. Del Carpio
1996 Volume 7 Pages
224-225
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
This is the second of a series of articles describing our system for the prediction of protein conformation in solution. Here we propose a force field for studying protein folding in solution. Our force field is made up of an internal force field (MM2) and a solvent force field, which sums up the constraints that the solvent imposes on protein structure in solution as compared with the gas phase.
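The abstract does not give the functional form of the solvent term. As a sketch only, and assuming the common decomposition suggested by the title (a nonpolar term proportional to solvent-accessible surface area plus a Generalized Born polarization term in the widely used form of Still et al.), the total energy could be written as follows. The symbols are assumptions of this sketch: $A_i$ is the solvent-accessible surface area of atom $i$, $\sigma_i$ an atomic solvation parameter, $q_i$ a partial charge, $\alpha_i$ an effective Born radius, $r_{ij}$ an interatomic distance, and $\varepsilon_{\mathrm{in}}$, $\varepsilon_{\mathrm{solv}}$ the interior and solvent dielectric constants.

```latex
\[
E_{\mathrm{total}} = E_{\mathrm{MM2}} + E_{\mathrm{solv}}, \qquad
E_{\mathrm{solv}} = \sum_i \sigma_i A_i + \Delta G_{\mathrm{pol}}
\]
\[
\Delta G_{\mathrm{pol}} = -\frac{1}{2}
  \left( \frac{1}{\varepsilon_{\mathrm{in}}} - \frac{1}{\varepsilon_{\mathrm{solv}}} \right)
  \sum_{i,j} \frac{q_i q_j}{f_{\mathrm{GB}}(r_{ij})}, \qquad
f_{\mathrm{GB}}(r_{ij}) = \sqrt{ r_{ij}^2 + \alpha_i \alpha_j
  \exp\!\left( -\frac{r_{ij}^2}{4 \alpha_i \alpha_j} \right) }
\]
```

In the gas phase $\varepsilon_{\mathrm{solv}} = \varepsilon_{\mathrm{in}}$ and the surface term is absent, so $E_{\mathrm{solv}}$ vanishes and only the internal force field remains.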
-
Structural Motifs Encoded in the DNA?
Carlos A. Del Carpio, Valentin Gogonea, Katsuhisa Yamaguchi, Makoto Ta ...
1996 Volume 7 Pages
226-227
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
Experimental evidence implying that complementary DNA strands encode amino acids which exhibit complementary hydrophobic characteristics has led us to inspect sense-antisense homology in several hundred proteins recorded in the PDB. We present here partial results of this analysis, which relate localized peculiar structural characteristics of proteins to the sense-antisense homology boxes found in the primary sequences. A further analysis is performed to determine whether these sense-antisense homology boxes, if they exist within the protein, are encoded by unique sequences of codons in the DNA. We also report on the progress of the methodology and the results obtained so far.
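The relation between a sense peptide and its antisense counterpart can be sketched by translating the reverse complement of a coding sequence. This is a minimal illustration using the standard genetic code and assuming the complementary strand is read 5' to 3'; it is not the homology-box search used in the study, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def antisense_peptide(sense_dna):
    """Translate the reverse complement of the sense strand (read 5' to 3')."""
    return translate(sense_dna.upper().translate(COMPLEMENT)[::-1])


sense = "ATGAAA"
print(translate(sense), antisense_peptide(sense))
```

Here lysine (hydrophilic) on the sense strand faces phenylalanine (hydrophobic) on the antisense strand, which is the kind of complementary pairing the abstract refers to.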
-
A. Ogiwara, N. Ogasawara, M. Watanabe, T. Takagi
1996 Volume 7 Pages
228-229
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS
-
Sivasundaram Suharnan, Takeshi Itoh, Hidemi Watanabe, Jun-ichi Takeda, ...
1996 Volume 7 Pages
230-231
Published: 1996
Released on J-STAGE: July 11, 2011
JOURNAL
FREE ACCESS