Host: The Japanese Society for Artificial Intelligence
Name: 34th Annual Conference, 2020
Number: 34
Location: Online
Date: June 09, 2020 - June 12, 2020
Word embedding models show high performance on the analogy task, among other semantic tasks. The prevailing explanation for this performance is that the inner product of word vectors produced by these models approximates the co-occurrence frequency weighted by pointwise mutual information (PMI), which suggests that the PMI matrix carries the information essential to the analogy task. However, this explanation is insufficient to account for the high performance on analogy tasks, because PMI itself has no direct relation to analogy. To further investigate the role of the co-occurrence matrix in analogy tasks, we conduct experiments using a co-occurrence matrix weighted by the logarithm (logfreq), which preserves the original co-occurrence counts more closely. The results show that the log co-occurrence matrix can solve analogy tasks comparably to the PMI matrix, and that applying SVD to logfreq outperforms the other methods. This indicates that PMI is not necessary for analogy tasks, and that the characteristics of the original co-occurrence matrix merit further investigation.
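The pipeline described above (count co-occurrences, weight them, factor with SVD, solve analogies by vector arithmetic) can be illustrated with a minimal sketch. This is not the authors' code: the toy corpus, window size, and dimensionality are assumptions for illustration only, and the log(1 + count) weighting stands in for the "logfreq" matrix named in the abstract.

```python
# Sketch: log-weighted co-occurrence matrix + SVD word vectors,
# with analogies a : b :: c : ? answered by nearest neighbour to b - a + c.
import numpy as np

# Toy corpus and vocabulary (illustrative only)
corpus = "the king rules the kingdom and the queen rules the kingdom too".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts within a window of size 2
window = 2
C = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            C[idx[w], idx[corpus[j]]] += 1

# logfreq weighting: log(1 + count), so zero counts stay zero
M = np.log1p(C)

# Truncated SVD: keep the top-k singular directions as word vectors
U, S, Vt = np.linalg.svd(M, full_matrices=False)
k = 3
W = U[:, :k] * S[:k]

def analogy(a, b, c):
    """Return the word whose vector is closest (cosine) to b - a + c."""
    v = W[idx[b]] - W[idx[a]] + W[idx[c]]
    sims = W @ v / (np.linalg.norm(W, axis=1) * np.linalg.norm(v) + 1e-9)
    for w in (a, b, c):  # exclude the query words, as is standard
        sims[idx[w]] = -np.inf
    return vocab[int(np.argmax(sims))]
```

On a real corpus the same steps apply with a large sparse count matrix; replacing np.log1p with a PMI weighting reproduces the baseline the paper compares against.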