Host: The Japanese Society for Artificial Intelligence
Name: The 35th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 35
Location: [in Japanese]
Date: June 08, 2021 - June 11, 2021
In this paper, we propose neural network models based on the neural ordinary differential equation (NODE) for small-footprint keyword spotting (KWS). KWS, which detects predefined keywords in input audio data, has attracted considerable attention as a promising technique for realizing voice user interfaces that control mobile phones and smart speakers by voice. Many researchers have recently demonstrated KWS with artificial neural networks and achieved high inference accuracy. Voice-controlled devices, however, are usually battery-operated, so their memory footprint and compute resources are severely restricted. To cope with this restriction, we present techniques for applying NODE to KWS that reduce both the number of parameters and the amount of computation during inference. Finally, we show that the proposed model has 68% fewer parameters than the conventional KWS model.
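As a rough illustration of why a NODE-style block can need fewer parameters, the following minimal sketch compares a residual stack, in which every block has its own weights, with a fixed-step Euler integration of dh/dt = f(h; θ) that reuses a single set of weights at every step. It is not the model proposed in this paper: the dense tanh layer, the Euler solver, the sizes `DIM` and `DEPTH`, and all function names are assumptions made only for illustration.

```python
import math
import random


def make_layer(dim, rng):
    """Create one dense layer: a dim x dim weight matrix plus a bias vector."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
    bias = [0.0] * dim
    return weights, bias


def layer_fn(layer, h):
    """Toy dynamics function f(h; theta) = tanh(W h + b)."""
    weights, bias = layer
    return [math.tanh(sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(weights, bias)]


def count_params(layers):
    """Count scalar parameters over a list of dense layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


def resnet_forward(layers, h):
    """Residual stack: h <- h + f_k(h), with a different layer f_k per block."""
    for layer in layers:
        h = [x + y for x, y in zip(h, layer_fn(layer, h))]
    return h


def node_forward(layer, h, steps):
    """Fixed-step Euler solve of dh/dt = f(h) on [0, 1], reusing one layer."""
    dt = 1.0 / steps
    for _ in range(steps):
        h = [x + dt * y for x, y in zip(h, layer_fn(layer, h))]
    return h


rng = random.Random(0)
DIM, DEPTH = 8, 4  # hypothetical sizes, chosen only for illustration

res_layers = [make_layer(DIM, rng) for _ in range(DEPTH)]
node_layer = make_layer(DIM, rng)

res_params = count_params(res_layers)     # DEPTH * (DIM * DIM + DIM)
node_params = count_params([node_layer])  # DIM * DIM + DIM

x = [1.0] * DIM
res_out = resnet_forward(res_layers, x)
node_out = node_forward(node_layer, x, steps=DEPTH)

print(res_params, node_params)
```

With these toy sizes the shared-weight block holds one quarter of the residual stack's parameters. That ratio follows directly from `DEPTH` and is unrelated to the 68% reduction reported above. Weight sharing alone also leaves the per-step computation unchanged, so this sketch only illustrates the parameter count, not the reduction in computation.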