Accurate recognition of cattle behavior, particularly mounting behavior, plays a crucial role in livestock management, as it aids in detecting estrus and optimizing herd productivity. Traditional behavior recognition methods are labor-intensive and time-consuming, requiring manual observation and relying on predefined object categories. In this paper, we present a multimodal system for analyzing mounting behavior in black cattle that leverages YOLO-World, an object detection framework with open-vocabulary capabilities. While YOLO-World was developed to overcome the limitations of traditional YOLO models, such as their dependence on predefined categories, our contribution lies in integrating this model into a streamlined pipeline for end-to-end detection, tracking, and action recognition of black cattle. The system exploits YOLO-World's open-vocabulary detection to adapt to new and unseen objects and behaviors without manual categorization. Additionally, it is optimized for deployment on IoT edge devices, enabling real-time cattle monitoring in field conditions and significantly reducing the time and effort required by traditional methods. Experimental results demonstrate improved accuracy in mounting behavior recognition, though challenges such as false positives persist. Overall, this work presents a scalable, efficient solution for real-time cattle behavior analysis in open-world environments, contributing to advancements in automated livestock monitoring.