Abstract
We attempted to extract knowledge about the treatment process using reinforcement learning with a Markov decision process (MDP) environment model and medical records from cardiovascular internal medicine, in particular two biochemical values (BNP and creatinine) and medication profiles. Improving the design of the reward function remains an important problem for future work.