A program for fully automatic conversion of line plots in scientific papers into numerical data has been developed. Once image data are converted into numerical data, users can handle so-called 'spectra', such as X-ray photoelectron spectra and optical absorption spectra, for their own purposes: replotting them in different ways (for example, against the inverse of the wavenumber), subtracting them from the users' own data, and so forth. This article reports the details of the program, which consists of many parts, including several deep-learning models with different functions, removal of text characters, and color separation. Most of the deep-learning models achieve an accuracy higher than 95%. The usability of the program is demonstrated with several examples.
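As a rough illustration of the two uses named in this abstract, the sketch below converts a wavenumber axis to its inverse (expressed as a wavelength) and subtracts a digitized spectrum from a user's own data by linear interpolation. It is not part of the reported program: the function names `wavenumber_to_wavelength_nm`, `interp_linear`, and `subtract_reference` are hypothetical, and the choice of linear interpolation is an assumption.

```python
from bisect import bisect_left


def wavenumber_to_wavelength_nm(wavenumbers_cm1):
    """Inverse of the wavenumber (given in cm^-1), expressed in nm."""
    # 1 cm = 1e7 nm, so wavelength [nm] = 1e7 / wavenumber [cm^-1].
    return [1.0e7 / k for k in wavenumbers_cm1]


def interp_linear(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the digitized range")
    i = bisect_left(xs, x)  # index of the first grid point >= x
    if xs[i] == x:
        return ys[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def subtract_reference(user_x, user_y, ref_x, ref_y):
    """Subtract a digitized reference spectrum from the user's data.

    The reference is interpolated onto the user's x values, so the two
    data sets do not need to share a grid.
    """
    return [y - interp_linear(x, ref_x, ref_y) for x, y in zip(user_x, user_y)]
```

For example, `subtract_reference(my_x, my_y, paper_x, paper_y)` would return the difference spectrum on the user's own grid, provided every `my_x` value lies within the range covered by the digitized plot.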
A method for interdisciplinary materials search using a knowledge database, materials curationⓇ, has been proposed. It makes it possible to find a direction of search without numerical data from experiments or calculations. The knowledge used is a compilation of relations between materials properties. Examples of the compilation and of the computer system used to search it (a network-type database) are presented. Furthermore, a technique is under development to extract knowledge of quantitative relations from mathematical formulas in the literature.
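To make the idea of searching a network of property relations concrete, here is a minimal sketch that stores relations as an undirected graph and finds the shortest chain linking two properties by breadth-first search. The abstract does not describe how the actual system is implemented, so the data structure, the search strategy, the names `RELATIONS`, `build_graph`, and `find_path`, and the few textbook relations listed are all assumptions chosen for illustration.

```python
from collections import deque

# Illustrative relations only; not taken from the authors' database.
RELATIONS = [
    ("thermal conductivity", "electrical conductivity"),
    ("electrical conductivity", "carrier mobility"),
    ("carrier mobility", "effective mass"),
    ("electrical conductivity", "carrier density"),
]


def build_graph(relations):
    """Build an undirected adjacency map from (property, property) pairs."""
    graph = {}
    for a, b in relations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_path(graph, start, goal):
    """Return the shortest chain of related properties, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorting makes the traversal order deterministic.
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

A call such as `find_path(build_graph(RELATIONS), "thermal conductivity", "effective mass")` returns the chain of intermediate properties, which is the kind of qualitative search direction the abstract refers to; no numerical data are involved.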
The internuclear distances and dipole moments of 14 small molecules have been estimated by machine learning with only molecular orbital energies as the explanatory variables. We used four regression methods: partial least squares (PLS), random forest (RF), radial basis function kernel regularized least squares (krlsRadial), and Bayesian regularized neural networks (BRNN). We report only the BRNN results for the internuclear distances and the PLS results for the dipole moments. The coefficients of determination for the internuclear distances and the dipole moments are 0.9318 and 0.7265, respectively. These results show that the internuclear distances and the dipole moments can be predicted from the molecular orbital energies alone.
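For readers unfamiliar with the metric quoted in this abstract, the sketch below computes a coefficient of determination from observed and predicted values. It assumes the common definition R^2 = 1 - SS_res / SS_tot; the abstract does not state which definition the authors used (some packages report the squared correlation instead), and the function name `r_squared` is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: spread left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations about their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are all identical")
    return 1.0 - ss_res / ss_tot
```

Under this definition a value of 1 means perfect prediction and a value of 0 means the model does no better than always predicting the mean of the observations.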