Host : The Japanese Society for Artificial Intelligence
Name : The 38th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 38
Location : [in Japanese]
Date : May 28, 2024 - May 31, 2024
This paper introduces a novel approach for extracting named entities as outliers from press-release texts using anomaly detection, and validates its effectiveness and potential applicability to company research. The study uses the local outlier factor (LOF), a density-based anomaly detection technique known for its robust performance even in high-dimensional spaces. Specifically, the approach first uses pretrained FastText on the full press-release corpus to convert nouns into vectors, leveraging FastText’s adaptability to unknown words. These vectors are then fed into LOF to detect outliers. In the experiments, the proposed method successfully extracted eight types of named entities, as defined by IREX, as outliers. However, the identified outliers also included several words that do not meet the definition of a named entity, so the output contained noise.
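The LOF scoring step described in the abstract can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation: the word vectors below are synthetic stand-ins for FastText embeddings, and the labels (`noun_0`, `rare_name`), the neighbourhood size `k = 5`, and the function names are all assumptions made for illustration. The sketch follows the standard LOF definition (k-distance, reachability distance, local reachability density) and ignores distance ties for simplicity.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lof_scores(vectors, k):
    """Return one local outlier factor per vector (about 1 = inlier, larger = outlier).

    Simplified sketch: ties at the k-th distance are not expanded, and
    more than k exact duplicates of a point would cause a division by zero.
    """
    n = len(vectors)
    # k nearest neighbours of each point as (distance, index) pairs.
    neighbours = []
    for i in range(n):
        dists = sorted(
            (euclidean(vectors[i], vectors[j]), j) for j in range(n) if j != i
        )
        neighbours.append(dists[:k])
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Synthetic stand-in for noun embeddings (assumption: real input would be
# FastText vectors): 30 "ordinary" nouns near the origin in 5 dimensions,
# plus one vector placed far from the cluster.
random.seed(0)
words = [f"noun_{i}" for i in range(30)] + ["rare_name"]
vectors = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(30)]
vectors.append([10.0] * 5)

scores = lof_scores(vectors, k=5)
ranked = sorted(zip(scores, words), reverse=True)
for score, word in ranked[:3]:
    print(f"{word}\t{score:.2f}")
```

In this toy setting the isolated vector receives the highest score. On real embeddings the choice of `k` and of the score threshold above which a noun is treated as an outlier would need tuning, which is consistent with the noise the abstract reports in the output.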