Host: The Japanese Society for Artificial Intelligence
Name: The 39th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 39
Location: [in Japanese]
Date: May 27, 2025 - May 30, 2025
Encoders such as CNNs and Transformers embed high-dimensional objects (e.g., images) into low-dimensional vectors, and many previous studies treat the latent space formed by these embedding vectors as Euclidean. In this study, we aim to capture geometric structure of the latent space that a purely Euclidean assumption may overlook. To this end, we propose a method that associates the encoder's intermediate representations with probability distributions, thereby defining an information-geometric manifold on which geometric quantities such as the metric and curvature can be estimated. The set of distributions obtained by feeding an image dataset into the encoder forms an information-geometric manifold equipped with the α-divergence as its distance-like measure, and its expectation coordinates coincide with the embedding vectors. In experiments estimating the metric and curvature of the latent space of a CNN trained on the MNIST dataset, we found that the latent space exhibits positive curvature in many regions, indicating that it is not necessarily flat.
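As a rough illustration of the quantities named in the abstract, the following sketch computes an α-divergence and a Fisher information metric for a toy categorical family. It is not the authors' construction: treating a hypothetical intermediate representation as the logits of a categorical distribution is an assumption made only for this example, as are the function names `softmax`, `alpha_divergence`, and `fisher_metric`. The α-divergence follows Amari's convention, which reduces to KL(p‖q) at α = -1 and KL(q‖p) at α = +1.

```python
import math


def softmax(logits):
    """Map logits (natural parameters) to probabilities.

    For a categorical family the probabilities are also the
    expectation coordinates, i.e. the expected one-hot vector.
    """
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between strictly positive distributions.

    D_alpha(p||q) = 4 / (1 - alpha^2) * (1 - sum_i p_i^((1-alpha)/2) * q_i^((1+alpha)/2))
    with the limits alpha = -1 -> KL(p||q) and alpha = +1 -> KL(q||p).
    """
    if abs(alpha + 1.0) < 1e-12:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if abs(alpha - 1.0) < 1e-12:
        return sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    s = sum(
        pi ** ((1.0 - alpha) / 2.0) * qi ** ((1.0 + alpha) / 2.0)
        for pi, qi in zip(p, q)
    )
    return 4.0 / (1.0 - alpha ** 2) * (1.0 - s)


def fisher_metric(p):
    """Fisher information matrix in logit coordinates: diag(p) - p p^T.

    The matrix is singular along the all-ones direction because adding
    a constant to every logit leaves the distribution unchanged.
    """
    n = len(p)
    return [
        [(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical intermediate representations of two inputs (assumed values).
    h_a = [2.0, 0.5, -1.0]
    h_b = [0.0, 0.0, 0.0]
    p_a, p_b = softmax(h_a), softmax(h_b)
    print("expectation coordinates:", p_a)
    print("alpha = 0 divergence:", alpha_divergence(p_a, p_b, 0.0))
    print("Fisher metric at p_a:", fisher_metric(p_a))
```

In this toy setting the metric is available in closed form. Estimating curvature, as the abstract describes, would additionally require derivatives of the metric across the manifold, which this sketch does not attempt.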