Japanese Journal of Biometrics (計量生物学)
Online ISSN : 2185-6494
Print ISSN : 0918-4430
ISSN-L : 0918-4430
Special Feature: Statistical Analysis Using Secondary Clinical Data
Review
  • 弘 新太郎
    Article type: Review
    2025, Volume 46, Issue 2, pp. 63-74
    Published: 2025/11/30
    Released online: 2025/10/30
    Journal, free access

    This paper outlines the recent expansion in the use of medical real-world data (RWD) in the clinical development and post-marketing evaluation of drugs in the United States, the European Union, and Japan. We provide an overview of the knowledge researchers need to understand the capabilities and limitations of RWD compared with clinical trials. Recognizing the importance of daily clinical practice is essential for using RWD effectively, and it is crucial to formulate relevant research questions that these data can address. The paper covers three key areas: ① data sources of RWD, ② pharmacoepidemiology, and ③ pertinent laws for using RWD in Japan. Starting from a common understanding of the difference between non-interventional studies with RWD and clinical trials, the paper discusses the challenges and future prospects of using medical RWD in Japan and suggests that statisticians prepare for these challenges by deepening their knowledge of daily medical records and of handling log-formatted unstructured data.

  • 大山 哲司
    Article type: Review
    2025, Volume 46, Issue 2, pp. 75-100
    Published: 2025/11/30
    Released online: 2025/10/30
    Journal, free access

    In recent years, the use of medical information databases has increased. Research using these databases requires defining outcomes, exposures, and confounding factors that align with the research objectives from the information available in the database. At that stage, validation is required to evaluate the extent to which true cases can be identified. The presence or absence of disease is determined through chart review by multiple raters, and inter-rater reliability is evaluated. As the use of medical information databases grows, opportunities to conduct such reliability studies will also increase. In this paper, we therefore review measurement reliability and how to estimate the intraclass correlation coefficient and the kappa coefficient, which are used as reliability indicators, while also referring to recent research.
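
As a rough illustration of the kappa coefficient mentioned in this abstract, the following minimal sketch computes Cohen's kappa for two raters. It is not taken from the paper: the function name `cohen_kappa` and the chart-review ratings are hypothetical, and the two-rater unweighted form is an assumption about the simplest case.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)

    # Observed agreement: share of cases where both raters gave the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions per label.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in labels) / n**2

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical chart-review results: 1 = disease present, 0 = disease absent.
rater_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
rater_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 8 of 10 charts and each marks half of the charts as positive, so chance agreement is 0.5 and kappa corrects the raw 0.8 agreement accordingly.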

  • 古川 恭治, 熊野 夏海, 川添 百合香, 中倉 章祥
    Article type: Review
    2025, Volume 46, Issue 2, pp. 101-134
    Published: 2025/11/30
    Released online: 2025/10/30
    Journal, free access

    While collecting a complete dataset with no missing or inaccurate measurements is ideal, this is very rare for a number of reasons. Incompleteness in data can introduce bias and/or information loss when estimating the relationship between the factors of interest and the outcome, potentially reducing the quality and validity of research findings. In epidemiological observational studies, these sources of bias have become increasingly widespread with the growing use of real-world data, such as insurance claims databases and electronic medical records, which are not collected according to a research plan. This will increase the importance of controlling for these sources of bias in statistical analyses. This article focuses on two important issues of data incompleteness, missing data and measurement error, and discusses statistical approaches to address them.

  • 野原 康伸
    Article type: Review
    2025, Volume 46, Issue 2, pp. 135-151
    Published: 2025/11/30
    Released online: 2025/10/30
    Journal, free access

    In recent years, artificial intelligence (AI) has become deeply embedded in our daily lives, and machine learning, one of its key components, has gained increasing attention. Although machine learning can achieve high predictive accuracy, it has not been widely adopted in data analysis because its results are difficult to interpret. However, this barrier is beginning to break down with the advent of Explainable AI (XAI) technologies. Among the various types of machine learning algorithms, tree-based methods, such as decision trees and ensemble trees, are particularly notable. For tabular data commonly found in medical records, ensemble tree methods often outperform other approaches in both accuracy and scalability. This paper focuses on building machine learning models using ensemble trees and interpreting them with SHAP (SHapley Additive exPlanations), a widely used XAI technique. Ensemble tree models can be constructed almost automatically once the data are properly prepared. By using SHAP summary plots and dependence plots, we can gain insights into the overall structure of the data without requiring domain-specific expertise. Although these results may include confounding factors, this approach can still be valuable for uncovering potential medical knowledge.
