2022 Volume 91 Issue 1 Pages 28-38
We developed yield-estimation models for the eight most widely consumed rice cultivars in Japan, based on yield trial data and environmental variables including soil available N, N fertilizer, air temperature, and solar radiation. The difference between attainable yield and farmers’ actual yield was determined by comparing the model-estimated values with the yields reported in the statistical yearbook of the Ministry of Agriculture, Forestry and Fisheries (MAFF). The models were developed using three calculation methods, i.e., partial least squares regression (PLS), random forests (RF), and XGBoost (XGB), and the effect of data-cleansing treatments was evaluated. Data cleansing improved the goodness of fit for PLS but decreased it for RF and XGB. The calculation method that best explained yield variation differed among cultivars. In RF and XGB, daily minimum temperature during the reproductive period had the highest average rank of feature importance. The significant positive correlation between the model-estimated values and the yields in the MAFF statistical yearbook indicates that these models can explain the yield response to different environments, and the comparison shows that the attainable yield is 55.4–57.3 kg 10 a⁻¹ higher than farmers’ actual yield in Japan.
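The comparison between model-estimated and yearbook yields can be illustrated with a minimal sketch. It is not the authors' code: the function names `pearson_r` and `mean_yield_gap` and all numbers are hypothetical placeholders chosen for illustration, and the sketch assumes the gap is simply the mean of estimated minus reported yield, both in kg per 10 a.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_yield_gap(estimated, actual):
    """Mean difference (estimated minus actual), in the units of the inputs."""
    if len(estimated) != len(actual) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(e - a for e, a in zip(estimated, actual)) / len(estimated)


# Hypothetical values in kg per 10 a, NOT data from the study.
estimated = [600.0, 620.0, 580.0, 640.0]  # model-estimated (attainable) yield
actual = [550.0, 560.0, 530.0, 580.0]     # yield reported in a yearbook

r = pearson_r(estimated, actual)
gap = mean_yield_gap(estimated, actual)
print(f"r = {r:.3f}, mean gap = {gap:.1f} kg per 10 a")
```

A high positive `r` would suggest the estimates track the reported yields across environments, while `gap` summarizes how far attainable yield sits above actual yield on average.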