Organizer: The Japanese Society for Artificial Intelligence
Conference: The 38th Annual Conference of the Japanese Society for Artificial Intelligence, 2024
Edition: 38
Venue: Act City Hamamatsu + Online
Dates: 2024/05/28 - 2024/05/31
Bidders often spend a long time reading and understanding tender documents, because the documents are generally long and require specialized knowledge. Two functions are therefore in great demand: an item extractor, which extracts specific items, and a word-phrase highlighter, which highlights words or phrases related to specific items. Developing these functions requires solving two problems. The first concerns the annotated dataset. The second concerns the BERT-based sequence labeling approach when only a small training dataset is available. To address the first problem, we created two types of sequence labeling datasets, related to the item extractor and the word-phrase highlighter. To address the second problem, we propose an information extraction (IE) method that combines (1) a supervised learning approach using BERT-based sequence labeling and (2) a large language model (LLM)-based improver. Experimental evaluation demonstrates the effectiveness of our approach. Moreover, as an application, we developed a web application system called Tender Document Analyzer (TDDA).
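
As an illustration of what a sequence labeling output looks like for item extraction, the following minimal sketch turns per-token tags into extracted spans. The abstract does not state the tagging scheme, so the BIO scheme, the `DEADLINE` label, and the function name `decode_bio` are all assumptions made for this example. The sketch covers only the decoding step, not the BERT model or the LLM-based improver.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from BIO tags (assumed scheme, hypothetical labels)."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Store the span collected so far, if any.
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            flush()
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label continues the open span.
            current_tokens.append(token)
        else:
            # "O" or an I- tag that does not match the open span closes it;
            # the stray token itself is dropped.
            flush()
            current_label = None
            current_tokens = []
    flush()
    return spans


# Hypothetical tender sentence with one annotated item.
example_tokens = ["Bids", "are", "due", "by", "June", "30", "."]
example_tags = ["O", "O", "O", "O", "B-DEADLINE", "I-DEADLINE", "O"]
print(decode_bio(example_tokens, example_tags))
```

In such a setup, the same decoding could serve both functions: the item extractor would report the decoded spans, and the word-phrase highlighter would mark them in the displayed text.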