2022 Volume 32 Issue 2 Pages 202-207
To make use of the quantitative relationships among physical quantities expressed by mathematical formulas in science texts, an attempt was made to extract, by computer program, the formulas, the variables in them, and the physical meanings of those variables. Three methods were tried for extracting the physical meanings of the variables: rule-based, parsing plus rule-based, and deep learning. The rule-based methods, with or without parsing, were found to require further machine learning even after the rules had been applied, which makes the task unrealistically large. Deep learning by fine-tuning BERT on 4000 to 4500 labeled examples achieved an accuracy of more than 70%.