Host: The Japanese Society for Artificial Intelligence
Name : The 35th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 35
Location : [in Japanese]
Date : June 08, 2021 - June 11, 2021
In recent years, demand for edge AI has been growing from the viewpoints of real-time performance and data confidentiality. We use the QAT (Quantization Aware Training) method of TensorFlow and TensorFlow Lite to achieve faster inference and lower memory usage in edge AI. Given that new AI models are devised one after another, it is unlikely that QAT will support all operations, so depending on the AI model used, unsupported operations can degrade both speed and accuracy. In this paper, we take YOLOv3-tiny, an object detection model in which this problem occurs, as an example and propose methods for improving speed and accuracy. We were able to halve the inference time on the Raspberry Pi 3 Model B+ and improve the inference accuracy to the same level as before quantization.
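To illustrate why quantization saves memory and time on edge devices, the following sketch shows 8-bit affine quantization in the style TensorFlow Lite uses (real_value = scale * (quantized_value - zero_point)). This is a self-contained, illustrative example, not the paper's code or the TensorFlow Lite implementation; the function names and the per-tensor (rather than per-channel) scheme are simplifying assumptions.

```python
# Illustrative sketch (not the paper's code): per-tensor 8-bit affine
# quantization in the TFLite style, real = scale * (q - zero_point).
# Storing int8 instead of float32 cuts weight memory roughly 4x.

def quantize(values, num_bits=8):
    """Map floats to signed ints with a per-tensor scale and zero point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The representable range must include 0.0 so zero maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid divide-by-zero
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from quantized values."""
    return [scale * (v - zero_point) for v in q]

weights = [-1.5, -0.2, 0.0, 0.7, 1.5]
q, scale, zp = quantize(weights)
approx = dequantize(q, scale, zp)
```

Each dequantized weight differs from the original by at most one quantization step (the scale), which is the rounding error that QAT lets the network adapt to during training.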