Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
39th (2025)
Session ID : 3S1-GS-2-04

Estimating Geometric Quantities of the Encoder's Latent Spaces
Analyzing CNNs and Transformers with Information Geometry
*Ikumi AKATSUKA, Noboru MURATA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Encoders such as CNNs and Transformers embed high-dimensional objects (e.g., images) into low-dimensional vectors, and many previous studies treat the latent space formed by these embedding vectors as a Euclidean space. In this study, we aim to capture geometric structure of the latent space that may be overlooked under a purely Euclidean assumption. To this end, we propose a method that associates the encoder’s intermediate representations with probability distributions, thereby defining an information-geometric manifold on which geometric quantities such as the metric and curvature can be estimated. The set of distributions obtained by feeding an image dataset into the encoder forms an information-geometric manifold with the α-divergence serving as its distance-like measure, and its expectation coordinates coincide with the embedding vectors. In experiments estimating the metric and curvature of the latent space of a CNN trained on the MNIST dataset, we found positive curvature in many regions, indicating that the latent space is not necessarily flat.
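
As an illustration of the α-divergence mentioned in the abstract, the following is a minimal sketch for discrete distributions. The abstract does not say how intermediate representations are mapped to distributions, so the `softmax` mapping here is purely an assumption for illustration, as are the function names and the choice of Amari's sign convention for α. It is not the authors' implementation.

```python
import numpy as np

def softmax(z):
    """Map a real vector to a strictly positive categorical distribution.
    Assumption: one possible way to turn a representation into a distribution."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between strictly positive discrete distributions.
    Convention assumed: alpha = -1 gives KL(p||q), alpha = +1 gives KL(q||p)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if np.isclose(alpha, -1.0):
        return float(np.sum(p * np.log(p / q)))
    if np.isclose(alpha, 1.0):
        return float(np.sum(q * np.log(q / p)))
    mix = np.sum(p ** ((1.0 - alpha) / 2.0) * q ** ((1.0 + alpha) / 2.0))
    return float(4.0 / (1.0 - alpha ** 2) * (1.0 - mix))

# Hypothetical representation vectors standing in for two encoder outputs.
p = softmax([1.0, 0.0, -1.0])
q = softmax([0.0, 0.5, 1.0])
d0 = alpha_divergence(p, q, 0.0)  # alpha = 0 is symmetric in p and q
```

Under this convention the divergence is zero only when the two distributions coincide, and swapping the arguments corresponds to flipping the sign of α.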

© 2025 The Japanese Society for Artificial Intelligence