Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
In the financial domain, investors base their investment decisions on information in the documents disclosed when share certificates and corporate bonds are issued. These financial disclosure documents are published in XBRL format and consist of multiple text blocks and tables, in which the necessary information is scattered in natural-language form. It is desirable to extract this information from the disclosure documents and manage it continuously in a database. However, manual extraction is costly because there are many documents, each containing about 40 to 60 required items. In this manuscript, we apply natural language processing techniques to the disclosure documents and report the results of extracting the necessary information by syntax-tree pattern matching and table analysis.