Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
39th (2025)
Session ID : 2B1-OS-41d-03

Language Embedded 3D Gaussians at City-Scale for Geography-Aware Visual Programming
*Shunsuke YASUKI, Taiki MIYANISHI, Nakamasa INOUE, Shuhei KURITA, Koya SAKAMOTO, Daichi AZUMA, Jungdae LEE, Masato TAKI, Yutaka MATSUO
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

We propose GeoProg3D, a visual programming framework that enables natural language interaction with city-scale 3D scenes. GeoProg3D builds on two innovations that we introduce: the Geography-aware City-scale 3D Language Field (GCLF) and Geographical Vision APIs (GV-APIs). GCLF extends language fields to city-scale 3D data, allowing precise queries based on geographic information. GV-APIs provide specialized geographical vision processing tools such as segmentation and object detection. GeoProg3D constructs executable programs by dynamically composing GCLF and GV-API components, which yields accurate geographic inference. To evaluate this approach, we introduce the GeoEval3D dataset, which contains 952 query-answer pairs for five challenging geographical vision tasks: grounding, spatial reasoning, comparison, counting, and measurement. Experimental results show that GeoProg3D outperforms existing models on a variety of geographical vision tasks. We expect the framework to be applicable to urban planning, disaster response, environmental monitoring, and other fields.
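
To make the idea of composing a language-field query with vision tools into an executable program more concrete, here is a minimal toy sketch in Python. It is not the GeoProg3D implementation: the scene data, the tool names (`field_query`, `within_radius`, `first`, `count`), the step format, and the keyword-matching lookup that stands in for a language field are all hypothetical and chosen only for illustration. In a real system the program would presumably be generated from the user's question rather than written by hand.

```python
import math
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Landmark:
    name: str
    category: str
    xy: Tuple[float, float]  # toy planar coordinates in metres


# Hypothetical miniature "city"; a real scene would be a 3D representation.
SCENE: List[Landmark] = [
    Landmark("Tower A", "building", (0.0, 0.0)),
    Landmark("Tower B", "building", (300.0, 400.0)),
    Landmark("Tower C", "building", (900.0, 0.0)),
    Landmark("Central Park", "park", (100.0, 0.0)),
]


def field_query(text: str) -> List[Landmark]:
    """Stand-in for a language-field lookup: exact match on name or category."""
    key = text.lower()
    return [m for m in SCENE if key in (m.name.lower(), m.category)]


def within_radius(items: List[Landmark], center: Landmark, radius_m: float) -> List[Landmark]:
    """Stand-in for a geographic tool: keep items within radius_m of center."""
    cx, cy = center.xy
    return [m for m in items if math.hypot(m.xy[0] - cx, m.xy[1] - cy) <= radius_m]


# Hypothetical tool registry that a program can call by name.
TOOLS: Dict[str, Callable[..., Any]] = {
    "field_query": field_query,
    "within_radius": within_radius,
    "first": lambda items: items[0],
    "count": len,
}

# A step is (output variable, tool name, arguments); "$x" refers to variable x.
Step = Tuple[str, str, List[Any]]


def run_program(steps: List[Step]) -> Any:
    """Execute steps in order and return the value produced by the last one."""
    env: Dict[str, Any] = {}
    result: Any = None
    for out, tool, args in steps:
        resolved = [
            env[a[1:]] if isinstance(a, str) and a.startswith("$") else a
            for a in args
        ]
        result = TOOLS[tool](*resolved)
        env[out] = result
    return result


# Hand-written program for: "How many buildings are within 600 m of Central Park?"
PROGRAM: List[Step] = [
    ("park", "field_query", ["central park"]),
    ("anchor", "first", ["$park"]),
    ("buildings", "field_query", ["building"]),
    ("near", "within_radius", ["$buildings", "$anchor", 600.0]),
    ("answer", "count", ["$near"]),
]

if __name__ == "__main__":
    print(run_program(PROGRAM))  # 2 (Tower A and Tower B; Tower C is 800 m away)
```

The sketch touches two of the task types listed in the abstract, grounding (finding "Central Park") and counting, under the stated toy assumptions; it says nothing about how GCLF or the GV-APIs are actually built.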

© 2025 The Japanese Society for Artificial Intelligence