Journal of Computer Chemistry, Japan -International Edition
Online ISSN : 2189-048X
ISSN-L : 2189-048X
General Paper
Development of a Protein-Gene Motif Dictionary System for One-Stop Motif Analysis
Masahiro OHTOMO, Hiroaki KATO

2021, Volume 7, Article ID: 2020-0008

Abstract

The amino acid sequence of a protein is closely related to its structure and function. This is especially true for particular structural features called motifs, which are well-conserved sites in genome sequences. The volume of biological data, such as data on biopolymers, is increasing rapidly. Constructing a database that supports efficient analysis is important for identifying the structure and function of unknown biological data. Here, we constructed a protein-gene motif dictionary system for several model species using a NoSQL database management system. The dictionary stores protein sequence motifs based on PROSITE, along with their corresponding mRNA sequences. It also stores 3D structural information for the corresponding protein sequence motifs. The protein-gene dictionary has 49,265 registered entries, 120,047 sequence motifs, and 57,452 3D structural motifs from 7 model species. Software tools with a graphical user interface were also developed to support intuitive search and analysis using the system. Using the system, we found that the zinc protease motif co-occurs with the cysteine switch motif. The zinc protease motif was followed by the cysteine switch motif with a gap of 117 to 293 amino acids; however, the 3D Euclidean distance between the two motifs was preserved at around 12 Å.
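
The kind of comparison described in the abstract (the sequence gap between two co-occurring motifs versus their distance in 3D space) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two patterns are simplified illustrations written in PROSITE-style syntax rather than the actual PROSITE entries, the toy sequence and the coordinates are made up, and the function names (prosite_to_regex, find_motif, sequence_gap) and the gap definition (residues between the end of the first hit and the start of the second) are assumptions made for this example.

```python
import math
import re

# One pattern element: optional '<' anchor, a residue / 'x' / [allowed] / {forbidden},
# an optional repeat such as (2) or (2,4), and an optional '>' anchor.
_ELEMENT = re.compile(
    r"(<?)([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+(?:,\d+)?)\))?(>?)"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, repeat, end = match.groups()
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        parts.append(
            ("^" if start else "")
            + core
            + ("{" + repeat + "}" if repeat else "")
            + ("$" if end else "")
        )
    return "".join(parts)


def find_motif(sequence: str, pattern: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of non-overlapping hits, 0-based, end exclusive."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.end()) for m in regex.finditer(sequence)]


def sequence_gap(first: tuple[int, int], second: tuple[int, int]) -> int:
    """Number of residues between the end of `first` and the start of `second`."""
    return second[0] - first[1]


# Simplified, illustrative patterns (assumptions, not the PROSITE entries).
ZINC_LIKE = "H-E-x(2)-H"
CYS_SWITCH_LIKE = "P-R-C-[GN]-x-P-[DR]"

# Made-up toy sequence containing one hit of each pattern.
toy_sequence = "AAHEAAHGGGGGPRCGVPDAA"

zinc_hit = find_motif(toy_sequence, ZINC_LIKE)[0]
cys_hit = find_motif(toy_sequence, CYS_SWITCH_LIKE)[0]
gap = sequence_gap(zinc_hit, cys_hit)

# Made-up representative coordinates (in angstroms) for each motif.
zinc_xyz = (0.0, 0.0, 0.0)
cys_xyz = (3.0, 4.0, 12.0)
distance = math.dist(zinc_xyz, cys_xyz)

print(f"sequence gap: {gap} residues, 3D distance: {distance:.1f} angstroms")
```

In a real analysis the coordinates would come from the 3D structural motif entries, and the choice of representative point for each motif (for example a specific atom or a centroid) would have to be fixed consistently; the abstract does not specify this.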

© 2021 Society of Computer Chemistry, Japan

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial No Derivatives (CC BY-NC-ND) 4.0 License.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja