To ensure safe and efficient operation of a robot combine harvester, seven semantic segmentation models were developed for pixel-wise detection of objects in a rice field. These models were trained and tested on four datasets. The results showed that all models performed well in segmenting rice field images. The pixel accuracy, class mean accuracy, mean intersection over union (IoU), IoU of the lodging area, and detection accuracy of lodging existence of the best model were 0.9719, 0.8801, 0.8449, 0.6933, and 0.9448, respectively. The best model reached a frame rate of 14.04 frames per second (FPS) at an image size of 640×480 pixels on an embedded processor (Jetson TX2), which was fast enough to detect the objects in the rice field images.
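The abstract does not give formulas for its pixel-level metrics. As a minimal sketch, assuming the standard confusion-matrix definitions commonly used for semantic segmentation, pixel accuracy, class mean accuracy, and mean IoU could be computed as follows. The function name `segmentation_metrics` and the two-class confusion matrix are hypothetical and purely illustrative; they are not data from the study.

```python
def segmentation_metrics(conf):
    """Compute pixel accuracy, class mean accuracy, and mean IoU.

    conf[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j (assumed convention).
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    true_pos = [conf[i][i] for i in range(n)]
    gt_count = [sum(conf[i]) for i in range(n)]                          # pixels per ground-truth class
    pred_count = [sum(conf[i][j] for i in range(n)) for j in range(n)]  # pixels per predicted class

    # Fraction of all pixels that were labeled correctly.
    pixel_accuracy = sum(true_pos) / total

    # Per-class recall, averaged over classes that occur in the ground truth.
    class_acc = [true_pos[i] / gt_count[i] for i in range(n) if gt_count[i] > 0]
    class_mean_accuracy = sum(class_acc) / len(class_acc)

    # Per-class IoU = TP / (ground truth + predicted - TP), averaged over classes.
    ious = []
    for i in range(n):
        union = gt_count[i] + pred_count[i] - true_pos[i]
        if union > 0:
            ious.append(true_pos[i] / union)
    mean_iou = sum(ious) / len(ious)

    return pixel_accuracy, class_mean_accuracy, mean_iou, ious


# Hypothetical two-class example (illustrative pixel counts only).
example_conf = [
    [50, 10],  # ground truth class 0
    [5, 35],   # ground truth class 1
]
pa, cma, miou, per_class_iou = segmentation_metrics(example_conf)
print(f"pixel accuracy={pa:.4f}, class mean accuracy={cma:.4f}, mean IoU={miou:.4f}")
```

Under these definitions, the IoU of a single class such as the lodging area is the corresponding entry of `per_class_iou`. A class that covers few pixels can score a low IoU while overall pixel accuracy stays high, which is one possible reading of the gap between 0.6933 and 0.9719 reported above.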