Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
26th (2012)
Session ID: 3M2-IOS-3b-3
Conference information

On Chinese Postal Address and Associated Information Extraction
*Chia-Hui CHANG, Chia-Yi HUANG, Yueng-Sheng SU

Abstract

Address information is closely tied to people's daily lives. People often need to look up the addresses of shopping malls, schools, and organizations, and use map-marking services to confirm their actual locations. MapMarker is a service that extracts English postal addresses from general web pages and marks them, together with associated information, on a map. This paper extends the idea to the extraction of Chinese postal addresses from the Web and improves the extraction of the associated information for each address with hierarchical clustering. We show how to prepare the training data and conduct full address extraction using both BIEO and IO tagging methods, and we compare results with and without Yahoo Chinese word segmentation. The results show that Chinese postal addresses can be extracted with a high F-measure of 0.97 using BIEO tagging without word segmentation, since incorrect segmentation can lead to worse labeling of address tokens. Meanwhile, the associated information for each address is identified by clustering the addresses into address blocks. The F-measure improves from 0.90 to 0.92.
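To make the difference between the two tagging schemes concrete, the following minimal sketch labels a string character by character, which is one way to read "without word segmentation". The function name `tag_chars`, the span format, and the sample address are illustrative assumptions. They are not taken from the paper, which may define its tokens, labels, and features differently.

```python
def tag_chars(text, spans, scheme="BIEO"):
    """Label each character of `text` as part of an address or not.

    `spans` is a list of (start, end) index pairs with `end` exclusive;
    this format is an assumption made for illustration.
    scheme="IO":   every address character is tagged "I".
    scheme="BIEO": the first character is "B", the last is "E",
                   the characters in between are "I".
    Characters outside every span are tagged "O" in both schemes.
    """
    tags = ["O"] * len(text)
    for start, end in spans:
        for i in range(start, end):
            if scheme == "IO":
                tags[i] = "I"
            elif i == start:
                tags[i] = "B"  # a one-character span is tagged "B" only
            elif i == end - 1:
                tags[i] = "E"
            else:
                tags[i] = "I"
    return tags


if __name__ == "__main__":
    # Hypothetical example: a label followed by a made-up address.
    text = "地址:台北市信義路五段7號"
    address_spans = [(3, len(text))]
    for scheme in ("BIEO", "IO"):
        print(scheme, " ".join(tag_chars(text, address_spans, scheme)))
```

Under IO, two addresses that sit directly next to each other collapse into a single run of "I" tags, whereas BIEO marks where each address begins and ends. This sketch only produces training labels and does not show the sequence labeling model that learns from them.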

© 2012 The Japanese Society for Artificial Intelligence