Article ID: 2024LHP0003
A compact robot vision system that identifies objects in real environments with large changes in illumination requires attention to both the algorithm and the hardware architecture. In this study, we developed a compact, low-power image recognition system that is robust to illumination changes. It consists of a CMOS image sensor, a field-programmable gate array (FPGA), and a graphics processing unit (GPU) for embedded devices. To minimize the effects of illumination changes, the system uses the center/surround (C/S) retinex model, a color constancy model of the visual nervous system. Because the C/S retinex model involves large-scale spatial filtering with high computational cost, we compared the processing speeds of different hardware implementations of the model. The comparison showed that the FPGA implementation used in this study is more than 10 times faster than the GPU implementation. The output of this efficiently processed C/S retinex model is passed to a relatively small convolutional neural network (CNN), which runs on the embedded GPU and performs object classification. We also investigated how the spatial parameters of the C/S retinex model affect the classification accuracy of CNNs, using a dataset of images acquired under various lighting conditions. This investigation identified parameters that give better classification accuracy, and these parameters were largely independent of the CNN architecture used. The system performed object classification under various illumination colors at 52.1 frames per second with a power consumption of approximately 10.9 W.
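
To illustrate why a center/surround operation can reduce sensitivity to illumination, the following minimal sketch may help. The abstract does not give the formulation used in this study, so the sketch assumes one common single-scale form: the log of each pixel (center) minus the log of a Gaussian-weighted local average (surround), applied per color channel. The function names, the `sigma` parameter (standing in for a spatial parameter of the surround), and the border handling are all hypothetical choices for illustration, not the authors' implementation.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(values, kernel):
    """Convolve a 1-D list with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), n - 1)
            acc += w * values[idx]
        out.append(acc)
    return out

def gaussian_blur(channel, sigma):
    """Separable Gaussian blur of a 2-D list: rows first, then columns."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    rows = [blur_1d(r, kernel) for r in channel]
    cols = [blur_1d(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def cs_retinex(channel, sigma, eps=1e-6):
    """Assumed C/S form: log(center) - log(Gaussian surround), one channel.

    eps only guards against log(0); it is an illustrative choice.
    """
    surround = gaussian_blur(channel, sigma)
    return [[math.log(c + eps) - math.log(s + eps)
             for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(channel, surround)]
```

Under this assumed form, multiplying a whole channel by a constant (a crude stand-in for a change in illuminant intensity or color) scales the center and the surround by the same factor, so the factor cancels in the log difference. The surround is the large spatial filter: a wider `sigma` means a longer kernel and more multiply-accumulate operations per pixel, which is consistent with the computational cost noted above.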