Abstract
This research develops a center of gravity (COG) estimation algorithm that determines the grasping position of Japanese bento without additional sensors or prior grasping trials, with the aim of improving automated production efficiency in the food industry. A large vision model (LVM) and YOLO11 are employed for initial bento detection and dish classification. The COG is then estimated by color thresholding followed by weighted averaging over the pixel area and density of each dish. Experimental validation with a UR3 manipulator demonstrates that the method grasps bento accurately. These results underline the importance of the COG in object grasping and provide insights for food production automation.
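The COG estimation step described above can be illustrated with a minimal sketch. It assumes that each dish is segmented by a per-channel RGB range and that the weight of a dish is its pixel area multiplied by a per-dish density value. The function names, the color ranges, and the density numbers are all hypothetical and chosen for illustration only; the exact formulation used in this work may differ.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Mask = List[List[int]]         # 1 = pixel belongs to the dish, 0 = background


def color_threshold(image: List[List[Pixel]], lower: Pixel, upper: Pixel) -> Mask:
    """Binary mask of pixels whose channels all lie within [lower, upper]."""
    return [
        [1 if all(lo <= c <= hi for c, lo, hi in zip(px, lower, upper)) else 0
         for px in row]
        for row in image
    ]


def area_and_centroid(mask: Mask) -> Tuple[int, Tuple[float, float]]:
    """Pixel count of the mask and its centroid (x, y) in pixel coordinates."""
    area = sum_x = sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                area += 1
                sum_x += x
                sum_y += y
    if area == 0:
        raise ValueError("mask contains no pixels")
    return area, (sum_x / area, sum_y / area)


def estimate_cog(masks: Dict[str, Mask], density: Dict[str, float]) -> Tuple[float, float]:
    """Weighted average of dish centroids, with weight = pixel area * density."""
    total = cog_x = cog_y = 0.0
    for dish, mask in masks.items():
        area, (cx, cy) = area_and_centroid(mask)
        weight = area * density[dish]   # assumed proxy for the dish's mass
        total += weight
        cog_x += weight * cx
        cog_y += weight * cy
    return cog_x / total, cog_y / total


# Toy 1x4 image: two white "rice" pixels, one background pixel, one red "fish" pixel.
image = [[(255, 255, 255), (255, 255, 255), (0, 0, 0), (200, 50, 50)]]
masks = {
    "rice": color_threshold(image, (200, 200, 200), (255, 255, 255)),
    "fish": color_threshold(image, (150, 0, 0), (255, 100, 100)),
}
density = {"rice": 1.0, "fish": 2.0}   # hypothetical relative densities
cog = estimate_cog(masks, density)
```

In this toy case the rice covers twice the area of the fish but has half the assumed density, so both dishes carry equal weight and the estimated COG falls midway between their centroids rather than at the geometric center of the image.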