This study investigated an algorithm for estimating “Stir fried food” and “Deep fried food”, two acoustic events that occur during cooking activities. Three peak frequencies were extracted using linear predictive analysis, a technique from speech processing, and the sounds of “Stir fried food” and “Deep fried food” were classified using a classifier. Linear predictive analysis is used in speech coding in the frequency range of 100-1000 Hz (for a source with a sampling frequency of 8000 Hz). In this study, the acoustic events “Stir fried food” and “Deep fried food” (sources with a sampling frequency of 44100 Hz), which contain high-frequency components, were first preprocessed using a digital filter, and spectral envelopes were then obtained with the LPC order set from 20 to 200. The peaks of each spectral envelope were arranged in order from the lowest frequency to the highest frequency and used as feature vectors, and a k-nearest-neighbor classifier was created for each LPC order. Using these classifiers to classify unknown sound sources, we found that when the LPC order was set between 46 and 50, the accuracy of event estimation was 0.91 to 0.94 for “Stir fried food” and 0.82 to 0.97 for “Deep fried food”.
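The pipeline described above (LPC spectral envelope, peak extraction, k-nearest-neighbor classification) can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions: the autocorrelation method with Levinson-Durbin recursion is used for the LPC fit, the three strongest envelope peaks are kept and sorted by frequency, missing peaks are padded with the Nyquist frequency, and the digital-filter preprocessing is omitted because the abstract does not specify the filter. The labels `stir` and `deep`, the function names, and the synthetic tone-plus-noise frames are hypothetical stand-ins for real recordings.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via Levinson-Durbin; returns a[0..order] with a[0] = 1."""
    frame = frame * np.hamming(len(frame))
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]  # lags 0..order
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a

def envelope_peak_features(frame, fs, order, n_peaks=3, n_fft=4096):
    """Frequencies (Hz) of the strongest envelope peaks, sorted low to high (assumed rule)."""
    a = lpc_coefficients(frame, order)
    env = -np.log(np.abs(np.fft.rfft(a, n_fft)) + 1e-12)  # log of 1/|A(f)|
    is_peak = (env[1:-1] > env[:-2]) & (env[1:-1] > env[2:])
    peak_bins = np.flatnonzero(is_peak) + 1
    strongest = peak_bins[np.argsort(env[peak_bins])[-n_peaks:]]
    freqs = np.sort(strongest) * fs / n_fft
    # Assumption: pad with the Nyquist frequency if fewer than n_peaks peaks exist.
    return np.pad(freqs, (0, n_peaks - len(freqs)), constant_values=fs / 2)

def knn_predict(train_x, train_y, query, k=3):
    """Plain majority-vote k-nearest-neighbor with Euclidean distance."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

def synth_frame(tones_hz, fs, rng, n=2048):
    """Hypothetical test signal: a few tones with random phase plus white noise."""
    t = np.arange(n) / fs
    sig = sum(np.sin(2 * np.pi * f * t + rng.uniform(0, 2 * np.pi)) for f in tones_hz)
    return sig + 0.1 * rng.standard_normal(n)

def demo(order=48, fs=44100, seed=0):
    """Train on synthetic frames and classify one unseen frame per class."""
    rng = np.random.default_rng(seed)
    classes = {"stir": (1000, 3000, 5000), "deep": (6000, 9000, 12000)}  # made-up tones
    train_x, train_y = [], []
    for label, tones in classes.items():
        for _ in range(5):
            frame = synth_frame(tones, fs, rng)
            train_x.append(envelope_peak_features(frame, fs, order))
            train_y.append(label)
    train_x, train_y = np.array(train_x), np.array(train_y)
    return {
        label: knn_predict(
            train_x, train_y, envelope_peak_features(synth_frame(tones, fs, rng), fs, order)
        )
        for label, tones in classes.items()
    }

if __name__ == "__main__":
    print(demo())
```

The default `order=48` is chosen only because it falls within the 46 to 50 range reported above. To mirror the per-order classifiers in the study, `demo` could be called in a loop over orders from 20 to 200 and the accuracies compared on real labeled recordings.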