Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
39th (2025)
Session ID : 4P2-OS-17b-02

Constraint Definition of Combinatorial Optimization Problems Using Natural Language and Visual Information: Solution Generation via a Reasoning Visual Language Model
Exploring a Non-Mathematical Approach for the Democratization of Combinatorial Optimization
*Shota INOUE, Takumi BANNAI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Combinatorial optimization specialists often struggle to fully capture complex domain knowledge and formulate constraints accurately, which restricts widespread practical adoption. We evaluated the effectiveness of a reasoning vision-language model (o1) on combinatorial optimization problems with practical constraints expressed in natural language and visual information. Using the Traveling Salesman Problem (TSP), we compared the solutions produced by our approach with exact solutions derived from mixed-integer linear programming and with those generated by a vision-language model (GPT-4o). For standard TSP instances with N = 10–30, o1 achieved a smaller optimality gap than both GPT-4o and the nearest-neighbor heuristic. Moreover, for time-window and precedence constraints expressed in natural language, GPT-4o failed to meet these constraints, whereas o1 achieved a constraint satisfaction rate exceeding 90%. Additionally, o1 complied with over 80% of the visually defined area-order constraints. These results suggest that non-experts can introduce practical constraints without relying on mathematical models.
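
The evaluation quantities named in the abstract (optimality gap, the nearest-neighbor baseline, and constraint satisfaction rate) can be illustrated with a minimal Python sketch. This is not the authors' code: all function names are hypothetical, the gap is assumed to be the usual relative excess over the optimal tour length, and the exact reference is computed here by brute-force enumeration (practical only for very small N) rather than by the mixed-integer linear programming used in the paper. Only precedence constraints are checked; time-window and area-order constraints are omitted.

```python
import itertools
import math


def tour_length(points, tour):
    """Length of the closed tour that visits `points` in the order `tour`."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]])
        for i in range(n)
    )


def nearest_neighbor_tour(points, start=0):
    """Greedy baseline: repeatedly move to the closest unvisited city."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: math.dist(points[last], points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def brute_force_optimum(points):
    """Exact optimum by enumeration (stand-in for MILP; assumption)."""
    rest = range(1, len(points))
    best = min(
        itertools.permutations(rest),
        key=lambda p: tour_length(points, (0, *p)),
    )
    return [0, *best]


def optimality_gap(candidate_length, optimal_length):
    """Relative excess over the optimum (assumed definition)."""
    return (candidate_length - optimal_length) / optimal_length


def satisfies_precedence(tour, pairs):
    """True if, for every (a, b) in `pairs`, city a is visited before b."""
    position = {city: i for i, city in enumerate(tour)}
    return all(position[a] < position[b] for a, b in pairs)


def satisfaction_rate(tours, pairs):
    """Fraction of tours that satisfy all precedence pairs."""
    return sum(satisfies_precedence(t, pairs) for t in tours) / len(tours)
```

In use, a tour proposed by a model would be passed to `tour_length` and compared against the length of `brute_force_optimum(points)` via `optimality_gap`, while `satisfaction_rate` would be computed over the tours returned for a set of constrained instances.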

© 2025 The Japanese Society for Artificial Intelligence