Japanese Journal of Psychosomatic Medicine
Online ISSN : 2189-5996
Print ISSN : 0385-0307
ISSN-L : 0385-0307
Special Issues / Statistics for Interpreting and Practicing Psychosomatic Medicine Research
Statistical Semantic Analysis with the Structural Topic Model for Qualitative Data
Seiji Muranaka
JOURNAL FREE ACCESS

2021 Volume 61 Issue 8 Pages 715-721

Abstract

This study introduces the structural topic model (STM), a natural language processing (NLP) technology, for qualitative data analysis. Natural language is the language that humans use spontaneously to interact. Extant studies in psychology frequently apply conventional qualitative analysis to text data, including the KJ method and grounded theory. However, these methods can raise several problems, such as limited reproducibility of results. NLP technology improves the computer processing of language and inference. STM is a statistical approach that builds on Latent Dirichlet Allocation (LDA) to infer latent variables, called topics, from observed words. It has three components: a topic prevalence model, a topical content model, and a core language (or observation) model. STM can incorporate covariates in the topic prevalence model and the topical content model to generate quantitative results. It can offer higher reproducibility than conventional qualitative analysis while extracting the meanings of documents as topics. In this study, we apply STM to a famous Japanese literary work, Kokoro by Soseki Natsume, which comprises 1300 paragraphs. The plot revolves around a man and his older teacher, whose life contained a secret involving a woman named Ojosan. This secret culminated in his friend’s suicide and, eventually, his own. The novel reflects a generational shift in values and the uncertainty of human behavior, all under the shadow of the central theme of death. The novel was morphologically analyzed, stop words were removed, and the text was then processed with STM. Preprocessing was done using the MeCab software for morphological analysis with the mecab-ipadic-NEologd dictionary. We used the STM package for training and visualization, creating a word cloud and a correlational network diagram of the topics. The searchK function included in the STM package indicated that a model with 11 topics was reasonable.
Of these 11 topics, Topic 4 (conflict for the new generation) indicated ambivalent emotions toward the changing times, which is strongly related to the story synopsis described above. Topic 4 is correlated with Topic 6 (the gap between image and fact), Topic 7 (teacher’s self-punishing attitude), Topic 8 (things that cannot be avoided), Topic 9 (repetitive thinking), and Topic 11 (innermost feeling). In addition, we found correlations among Topic 2 (Ojosan, the girl), Topic 3 (pride), and Topic 5 (Okusan: the wife and Ojosan’s mother), and narrative reviews also highlight these relationships. Although STM raises some practical concerns, such as the need to preprocess the text, it may be suitable for clinical studies.

© 2021 Japanese Society of Psychosomatic Medicine