Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
39th (2025)
Session ID : 2S4-GS-2-05

Segmentation of Multinomial Distribution Regimes in Categorical Series Data and Investigation of Model Selection Criteria
*Yuki TAKEISHI, Aoi HAGITA, Ryoichiro YAMAZAKI, Kento TAKAI, Joy TANIGUCHI, Yuki YAMAGISHI
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Categorical sequential data often contain redundant information and become difficult to interpret visually when their variations are complex. In this study, we assume that the observed data follow a different multinomial distribution in each of multiple regimes, and we detect the breakpoints between regimes using dynamic programming based on maximum likelihood estimation. A similar method, Pruned Exact Linear Time (PELT), exists, but it is primarily designed for one-dimensional data; to handle multidimensional data such as categorical data, the objective function must be extended according to the number of dimensions. The proposed method is expected to contribute to the understanding of sequential variations and to enhance interpretability in categorical data analysis, serving as a useful analytical foundation, especially for datasets with complex variation characteristics. In the evaluation experiments, we investigate model selection criteria (AIC, BIC, MDL, and the elbow method) for mitigating overfitting and clarify the model selection tendencies of each criterion. In particular, we focus on the L method, a technique that automates the elbow method, and examine its behavior.
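
As a rough illustration of the kind of procedure the abstract describes, the following minimal Python sketch segments a categorical sequence into k regimes by dynamic programming on the maximized multinomial log-likelihood, then picks k with BIC. It is not the authors' implementation: the function names (`segment`, `bic`, `select_k`), the parameter count used in the BIC penalty (k(C-1) probabilities plus k-1 breakpoints), and the toy sequence are all assumptions made for this example, and the plain O(k n^2) recursion is used rather than PELT-style pruning.

```python
import math
from collections import Counter


def segment_cost(counts, length):
    """Negative maximized multinomial log-likelihood of one segment."""
    return -sum(c * math.log(c / length) for c in counts.values() if c > 0)


def segment(seq, k):
    """Split seq into k contiguous regimes with minimal total cost.

    Returns (total negative log-likelihood, interior breakpoints), where each
    breakpoint is the start index of a new regime.
    """
    n = len(seq)
    # cost[i][j]: cost of treating seq[i:j] as a single regime
    cost = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        counts = Counter()
        for j in range(i + 1, n + 1):
            counts[seq[j - 1]] += 1
            cost[i][j] = segment_cost(counts, j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting seq[:j] into m regimes
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost[i][j]
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers from the end to recover regime start indices.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts[1:]  # drop the leading 0


def bic(nll, k, n_categories, n):
    """BIC under an assumed parameter count for this sketch."""
    n_params = k * (n_categories - 1) + (k - 1)
    return 2.0 * nll + n_params * math.log(n)


def select_k(seq, max_k):
    """Return the number of regimes in 1..max_k with the lowest BIC."""
    n, c = len(seq), len(set(seq))
    scores = {k: bic(segment(seq, k)[0], k, c, n) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get)


# Toy sequence (hypothetical data): three regimes, each dominated by one category.
toy = ["a"] * 20 + ["b"] * 20 + ["c"] * 20
best_k = select_k(toy, 5)
nll, breakpoints = segment(toy, best_k)
print(best_k, breakpoints)
```

Swapping `bic` for an AIC or MDL penalty only changes the scoring function, so the same segmentation routine can be reused to compare how each criterion trades likelihood against the number of regimes.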

© 2025 The Japanese Society for Artificial Intelligence