Abstract
KNApSAcK Family DB is a set of databases on natural products and the organisms that produce them. In this article, we describe the species-natural product relation database, KNApSAcK Core DB, together with the current status of KNApSAcK Family DB, focusing on its expansion for use in multiple scientific fields and on the acquisition of new knowledge through mining techniques. Alkaloids have highly diverse chemical structures, including heterocyclic ring systems, and comprise more than 20,000 different molecules in organisms. To facilitate a systematic understanding of species-metabolite relationships, we have developed KNApSAcK Family DB. KNApSAcK Core DB stores 116,315 metabolite-species pairs and 51,179 distinct metabolites. Of these, 12,460 metabolites are alkaloids, covering almost all plant-produced alkaloids (approximately 12,000 alkaloids). Comparing the numbers of alkaloids linked to different starting substances provides information on how diverse alkaloids originated and evolved. We applied the MGCNN model to the 12,460 compounds in the KNApSAcK Core DB. A large number of alkaloids were predicted to be associated with six starting substances: L-Arg, L-Tyr, L-Pro, L-Lys, L-Asp and L-Trp. These starting substances may contribute fundamentally to the structural diversity of alkaloids.