2025 Volume 68 Issue 6 Pages 333-337
This article reviews machine-learning-based analysis of spectroscopic data. In general, the number of spectral variables (discretized wavelengths or wavenumbers) tends to be large compared to the number of spectra obtained. Dimensionality reduction is therefore used to compress the spectral dataset so that the number of explanatory variables becomes smaller than the number of samples. As a practical application, dimensionality reduction by principal component analysis (PCA) of a set of attenuated total reflection infrared (ATR-IR) spectra of Japanese paper is introduced. Practical demonstrations of spectroscopic regression by partial least squares (PLS) and classification by support vector machine (SVM) are also reported, using a set of near-infrared spectra of grapes, which show few distinct peaks usable for quantitative determination or qualitative classification.
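
The dimensionality-reduction idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (eight simulated spectra with one Gaussian band and a hypothetical analyte level `conc`), not the ATR-IR or near-infrared measurements discussed in the article, and the first principal component is obtained by plain power iteration using only the Python standard library. In practice a library routine such as scikit-learn's `PCA` would normally be used instead.

```python
import math
import random

# Synthetic stand-in data (an assumption, not the article's spectra):
# 8 "spectra" sampled at 50 wavelength points, so variables (50) > samples (8).
rng = random.Random(0)
n_samples, n_vars = 8, 50
conc = [0.1 * (i + 1) for i in range(n_samples)]  # hypothetical analyte level
band = [math.exp(-((j - 25) ** 2) / 50.0) for j in range(n_vars)]  # one Gaussian band
X = [[c * b + rng.gauss(0.0, 0.005) for b in band] for c in conc]

# Mean-center each variable (subtract the mean spectrum).
mean = [sum(row[j] for row in X) / n_samples for j in range(n_vars)]
Xc = [[row[j] - mean[j] for j in range(n_vars)] for row in X]


def project(rows, v):
    """Scores: projection of each centered spectrum onto direction v."""
    return [sum(x * w for x, w in zip(row, v)) for row in rows]


def first_component(rows, n_iter=100):
    """Leading PCA loading via power iteration on rows^T rows."""
    p = len(rows[0])
    v = [1.0 / math.sqrt(p)] * p  # start from a uniform unit vector
    for _ in range(n_iter):
        t = project(rows, v)  # t = Xc v
        w = [sum(t[i] * rows[i][j] for i in range(len(rows))) for j in range(p)]  # Xc^T t
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalize to unit length
    return v


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


loading = first_component(Xc)  # 50 loading weights (unit length)
scores = project(Xc, loading)  # one score per spectrum

# Fraction of the total centered sum of squares captured by the first component.
total_ss = sum(x * x for row in Xc for x in row)
explained = sum(t * t for t in scores) / total_ss

print(f"{n_vars} variables compressed to 1 score for each of {n_samples} spectra")
print(f"explained variance fraction: {explained:.3f}")
print(f"|corr(score, conc)|: {abs(pearson(scores, conc)):.3f}")
```

Here each 50-point spectrum is replaced by a single score, so the number of explanatory variables falls below the number of samples. Because the simulated spectra differ only in the height of one band, a single component captures almost all of the variance; real spectra would generally need several components, and the sign of the scores is arbitrary.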