Host: The Japanese Society for Artificial Intelligence
Name : 34th Annual Conference, 2020
Number : 34
Location : Online
Date : June 09, 2020 - June 12, 2020
Abstract: (1) Purpose: Discovering and retrieving relevant information from lengthy documents, such as product defect reports, call-center chat histories, and meeting minutes, is a challenging task. A technique for identifying the information type of each sentence in a document is therefore important. We investigated which kinds of feature engineering are effective for this task and examined whether the BERT model is effective. As a corpus, we used issue discussions from open source software projects such as TensorFlow and scikit-learn. (2) Results: We trained models using AutoML and computed global feature importance using SHAP; the results indicate that sentence length, the position of the sentence in the document, and the time between comments are important features. Limited fine-tuning of BERT, in which only the parameters of the final layer are trained, showed no significant difference in performance from ordinary logistic regression.
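
The three features the abstract reports as important (sentence length, position in the document, and time between comments) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Comment` class, the `sentence_features` function, the feature names, whitespace tokenization, and the sample thread are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class Comment:
    """Hypothetical container for one comment in an issue discussion."""
    posted_at: datetime
    sentences: List[str]


def sentence_features(comments: List[Comment]) -> List[Dict[str, float]]:
    """Return one feature row per sentence, in document order."""
    total = sum(len(c.sentences) for c in comments)
    rows: List[Dict[str, float]] = []
    index = 0
    previous_time = None
    for comment in comments:
        # Time elapsed since the previous comment (0 for the first comment).
        if previous_time is None:
            gap = 0.0
        else:
            gap = (comment.posted_at - previous_time).total_seconds()
        for sentence in comment.sentences:
            rows.append({
                # Length measured in whitespace-separated tokens (an assumption).
                "length_tokens": float(len(sentence.split())),
                # Position scaled to [0, 1] over the whole discussion.
                "relative_position": index / (total - 1) if total > 1 else 0.0,
                "seconds_since_previous_comment": gap,
            })
            index += 1
        previous_time = comment.posted_at
    return rows


# Made-up example thread, not taken from the corpus used in the study.
thread = [
    Comment(datetime(2020, 6, 9, 10, 0, 0),
            ["The build fails on Windows.", "Here is the traceback."]),
    Comment(datetime(2020, 6, 9, 10, 30, 0),
            ["Can you share your compiler version?"]),
]
features = sentence_features(thread)
```

Rows like these could then be passed to any tabular classifier; how the study actually encoded or normalized its features is not stated in the abstract.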