Near-infrared (NIR) spectroscopy is widely used for non-destructive food quality inspection. Prediction models are constructed that relate NIR spectra to quality parameters. However, noise in the spectra and overlap between the peaks of the target components and those of other components reduce the prediction accuracy of these models. To mitigate this problem, derivative spectra are used in modeling. Differentiating a spectrum emphasizes small, narrow peaks, which reduces the influence of peak overlap. On the other hand, differentiation also amplifies noise. The balance between these two effects changes with the derivative order, so an appropriate order must be selected for each data set. Moreover, if the target components have several peaks, the appropriate order differs from peak to peak. In this paper, we therefore construct regression models using the original spectra, the first-, second-, and third-derivative spectra, and combinations of them. The accuracy of models built from different derivative spectra or their combinations changes with the number of training samples. We thus propose a method that selects the appropriate model according to the number of training samples, based on the prediction accuracy of each model. The proposed method was applied to a simulated data set that mimics spectra in which three different peaks overlap, and then to the prediction of sugar content in oranges. The results showed that the most accurate model changed as the number of training samples changed, which demonstrates the effectiveness of the proposed method.
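The two competing effects of differentiation described above can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Gaussian peak shapes, the peak widths and heights, the unit wavelength spacing, the white noise level, and the use of plain forward finite differences (in practice a smoothing derivative filter would more likely be used). The helper names `gaussian`, `diff`, `peak_to_peak`, and `std` are hypothetical.

```python
import math
import random


def gaussian(x, center, width, height=1.0):
    """Gaussian peak evaluated at position x."""
    return height * math.exp(-0.5 * ((x - center) / width) ** 2)


def diff(values, order):
    """Apply the forward finite difference `order` times (unit spacing)."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def peak_to_peak(values):
    return max(values) - min(values)


def std(values):
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


grid = range(400)
# Assumed shapes: a broad interfering peak and a small, narrow target peak
# located at the same position, so that they overlap completely.
broad = [gaussian(x, 200, 40.0) for x in grid]
narrow = [gaussian(x, 200, 8.0, height=0.2) for x in grid]

# Assumed white measurement noise with a fixed seed for repeatability.
rng = random.Random(0)
noise = [rng.gauss(0.0, 0.001) for _ in grid]

summary = {}
for order in range(4):  # 0 = original spectrum, 1 to 3 = derivative order
    narrow_amp = peak_to_peak(diff(narrow, order))
    broad_amp = peak_to_peak(diff(broad, order))
    noise_std = std(diff(noise, order))
    summary[order] = {
        # Larger means the narrow target peak stands out more.
        "narrow_to_broad": narrow_amp / broad_amp,
        # Larger means the noise is worse relative to the target peak.
        "noise_to_narrow": noise_std / narrow_amp,
    }
    print(order, summary[order])
```

Under these assumptions both ratios grow with the derivative order: the narrow peak becomes more prominent relative to the broad one, while the noise grows relative to the narrow peak. This is the trade-off that makes the best derivative order depend on the data.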
Edited and published by: Division of Chemical Information and Computer Science, The Chemical Society of Japan. Produced and listed by: Division of Chemical Information and Computer Science, The Chemical Society of Japan.