2025 Volume 6 Issue 3 Pages 380-392
This paper proposes a method for predicting event locations from emergency call audio to assist road information collection operations. Operators must identify event locations by matching the information that callers convey verbally against geographical information about the areas they manage. To support this work, we construct a framework for geolocalization from emergency call audio, which is expected to reduce operator burden and improve operational efficiency. Our research addresses two key challenges: real-time processing and accurate place name recognition. We achieve real-time performance by combining automatic speech recognition with a retained summary of the conversation so far, which enables incremental location prediction while a call is still in progress. Because existing speech recognition models struggle with location-specific vocabulary, we improve place name recognition accuracy by fine-tuning them on synthetically generated datasets containing local geographical names. We evaluate the effectiveness of the proposed method using actual emergency call data collected in operational settings.