This paper proposes a method to improve the accuracy of predicting Diagnosis Procedure Combination (DPC) codes from discharge summaries. First, the discharge summary data are formatted uniformly using a large language model (LLM). Second, causal extraction is applied to extract causal information. Third, morphological analysis is performed on the original discharge summary data to create a vector of word features. Fourth, the causal information is used to highlight the features. Fifth, training examples for the classifier are generated. Finally, machine learning methods are applied to the training examples. Experimental validation shows that causal information is effective in improving the prediction accuracy of DPC codes.
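The feature-highlighting step above can be illustrated with a minimal sketch. The abstract does not say how features are highlighted, so everything here is an assumption: whitespace splitting stands in for morphological analysis, and the hypothetical `highlight_features` function with its `boost` factor simply multiplies the count of any word that also appears in the extracted causal information.

```python
from collections import Counter

def highlight_features(tokens, causal_tokens, boost=2.0):
    """Build a word-count vector and up-weight words found in causal spans.

    `boost` is a hypothetical weighting factor, not a value from the paper.
    """
    counts = Counter(tokens)
    causal = set(causal_tokens)
    return {
        word: count * boost if word in causal else float(count)
        for word, count in counts.items()
    }

# Toy example: whitespace splitting stands in for morphological analysis.
tokens = "patient admitted with pneumonia caused by aspiration pneumonia treated".split()
causal_tokens = ["pneumonia", "aspiration"]  # assumed output of causal extraction
features = highlight_features(tokens, causal_tokens)
```

The resulting dictionary could then be converted to a fixed-length vector and paired with the DPC code to form one training example for the classifier.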
Annotation of medical images is crucial for assessing cancer treatment outcomes and defining radiotherapy targets. It also plays a key role in medical AI research as a preprocessing step for machine learning models. However, the heavy workload of medical professionals limits their capacity for extensive annotation tasks. To address this, we propose a method for segmenting sequential medical images with minimal annotation effort. Building on UniverSeg, which enables few-shot segmentation without additional training, our approach iteratively improves segmentation by adding each inference result to the support set. Experiments on the HVSMR dataset show that our method outperforms the baseline UniverSeg.
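The iterative support-set idea above can be sketched as a short loop. This is an illustration under stated assumptions, not the authors' implementation: `segment_fn` is a hypothetical callable standing in for a few-shot model such as UniverSeg, `toy_segment_fn` is a trivial thresholding placeholder, and the sketch adds every prediction to the support set, although the abstract does not say whether predictions are filtered first.

```python
def segment_sequence(slices, support_images, support_labels, segment_fn):
    """Segment slices in order, adding each prediction to the support set."""
    support_images = list(support_images)  # copy so the caller's lists are untouched
    support_labels = list(support_labels)
    predictions = []
    for image in slices:
        mask = segment_fn(image, support_images, support_labels)
        predictions.append(mask)
        # Feed the inference result back in as a new support example.
        support_images.append(image)
        support_labels.append(mask)
    return predictions, support_images, support_labels

def toy_segment_fn(image, support_images, support_labels):
    """Hypothetical stand-in model: threshold each pixel at 0.5."""
    return [[1 if px > 0.5 else 0 for px in row] for row in image]

# Two tiny 1x2 "slices" and one annotated support example.
slices = [[[0.1, 0.9]], [[0.8, 0.2]]]
preds, s_imgs, s_lbls = segment_sequence(
    slices, [[[0.0, 1.0]]], [[[0, 1]]], toy_segment_fn
)
```

After the loop, the support set holds the original annotated example plus one entry per processed slice, so later slices are segmented with more context than earlier ones.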
In this paper, we present the development and evaluation of UT-MedEval, a clinical medical knowledge assessment set for large language models (LLMs), built using the detailed version of the disease ontology from the Clinical Ontology in Anatomical Structure and Disease (CONAND). UT-MedEval covers multiple medical domains, including cardiology, gastroenterology, neurology, nephrology-endocrinology, diabetes-metabolism, allergy-rheumatology, and orthopedics. It consists of 980 questions spanning three types of question-answering tasks and three response formats: free-form answers, multiple-choice (20 options), and yes/no questions. We evaluated OpenAI's GPT-4o and GPT-4o mini on this dataset. GPT-4o achieved an accuracy of 74.3% (95% CI: 71.5-77.0), while GPT-4o mini achieved 65.5% (95% CI: 62.5-68.4). The task requiring answers about the causes of diseases had the lowest accuracy.
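As a rough check on the reported intervals, a 95% confidence interval for a proportion can be computed as follows. The abstract does not state which interval method was used, so the Wilson score interval here is an assumption, and the correct-answer counts are inferred by rounding the reported percentages of 980 questions. The bounds may therefore differ slightly from the published ones.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

n = 980  # number of questions, from the abstract
# Counts are assumed: reported accuracy times n, rounded to the nearest integer.
gpt4o_low, gpt4o_high = wilson_interval(round(0.743 * n), n)
mini_low, mini_high = wilson_interval(round(0.655 * n), n)
print(f"GPT-4o:      {gpt4o_low:.3f} to {gpt4o_high:.3f}")
print(f"GPT-4o mini: {mini_low:.3f} to {mini_high:.3f}")
```

The two intervals computed this way do not overlap, which is consistent with the reported gap between the two models.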
Pancreatic cancer is one of the most difficult cancers to recover from, with a 5-year survival rate of 10%. We analyzed public clinical data to examine the relationship between metastasis sites and genetic mutations with respect to survival. The dataset comprised 956 cases with overall survival (OS) < 1 year. Principal component analysis (PCA) of subject attributes and clinical test results showed that MSI-Score had the highest contribution rate, at 0.45. In the PCA scatter plot of subject attributes and clinical test results, the cases separated into two groups based on genetic test information and metastasis information.