Bulletin of the Computational Statistics of Japan
Online ISSN : 2189-9789
Print ISSN : 0914-8930
ISSN-L : 0914-8930
Original Papers
A BAYESIAN NONPARAMETRIC TOPIC MODEL FOR MICROBIAL DATA MEASURED IN TIME SERIES
Tasuku Okui
JOURNAL FREE ACCESS

2019 Volume 32 Issue 2 Pages 119-133

Abstract
 16S rRNA sequence analysis, which analyzes the 16S rRNA region of the microbial genome, now makes it possible to obtain compositional data on microbial species. The latent Dirichlet allocation (LDA) model has been proposed as a dimension reduction method for analyzing these data.
 Microbiome data from 16S rRNA sequence analysis are often measured in time series to observe how the microbial environment of a subject changes over time. The dynamic topic model (DTM) is often used as an LDA model for time-series data. Although the number of topics needs to be pre-specified when using the DTM, it may be inferred automatically from the data by extending the DTM to a Bayesian nonparametric model. Therefore, a Bayesian nonparametric topic model for microbiome data measured in time series was proposed and compared with the DTM using real microbiome data. With the proposed model, only a few topics had large topic proportions on average, regardless of the pre-specified number of topics. In addition, the number of topics that had the largest proportion for at least one subject did not change with the pre-specified number of topics. These results suggest that the number of topics in microbiome data could be determined automatically using the proposed model.
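 The behavior reported above, where only a few topics receive large proportions whatever upper limit is set, can be illustrated with a minimal sketch. The abstract does not state which nonparametric prior the proposed model uses, so the truncated stick-breaking construction below is an assumption chosen purely for illustration and is not the author's model. The function names, the concentration value `alpha=2.0`, and the 0.05 threshold are likewise hypothetical.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Draw topic weights from a truncated stick-breaking construction.

    Illustrative assumption only: this is a generic Dirichlet-process-style
    prior, not the specific model proposed in the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any topic
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, alpha)  # share taken from the remaining stick
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # the last topic takes what is left, so weights sum to 1
    return weights


def count_large_topics(weights, threshold=0.05):
    """Count topics whose weight exceeds the (hypothetical) threshold."""
    return sum(1 for w in weights if w > threshold)


if __name__ == "__main__":
    rng = random.Random(0)
    # Vary the pre-specified upper limit on the number of topics.
    for truncation in (10, 20, 50):
        weights = stick_breaking_weights(alpha=2.0, truncation=truncation, rng=rng)
        print(truncation, count_large_topics(weights))
```

 Because each draw takes a fraction of whatever stick remains, later topics tend to receive geometrically smaller weights, so raising the truncation level mostly adds topics with negligible mass. This is only a prior-level intuition; the paper's conclusion rests on posterior results from real microbiome data.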
© 2019 Japanese Society of Computational Statistics