When convolutional neural networks (CNNs) are developed for medical applications in order to replace invasive examinations with non-invasive ones, collecting a training dataset requires performing invasive examinations on patients. Consequently, when a CNN has been developed with a limited training dataset, it is undesirable to claim that the dataset is too small without evidence; new samples should be collected only when an improvement in estimation performance can be expected. We therefore verified whether more samples should be collected for a medical CNN developed in previous research. Specifically, using the dataset for a CNN that estimates pulmonary artery wedge pressure (PAWP) from a chest radiograph, we built CNNs while increasing the dataset size and observed the changes in estimation performance and in the saliency maps. We found that estimation performance did not converge as the dataset size increased. Moreover, during estimation, the CNN developed with a small number of samples attends to a wide cardiac region, whereas the CNN developed with a large number of samples attends to a narrower cardiac region. These results suggest that, in this case, increasing the training data can be expected to improve both generalization performance and the saliency maps.
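The dataset-size experiment described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a closed-form ridge regressor stands in for CNN training, mean absolute error stands in for the estimation metric, and the 2% relative-improvement threshold is an arbitrary choice. The names `fit_ridge`, `learning_curve`, and `has_converged` are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression (assumed stand-in for training a CNN)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def learning_curve(X_train, y_train, X_val, y_val, sizes):
    """Train on the first n samples for each n in sizes; return validation MAE."""
    errors = []
    for n in sizes:
        w = fit_ridge(X_train[:n], y_train[:n])
        errors.append(float(np.mean(np.abs(X_val @ w - y_val))))
    return errors


def has_converged(errors, rel_tol=0.02):
    """Treat the curve as converged if the last size increase reduced the
    error by less than rel_tol (relative). The threshold is an assumption."""
    if len(errors) < 2:
        return False  # a single point says nothing about convergence
    prev, last = errors[-2], errors[-1]
    if prev <= 0:
        return True  # error is already zero; nothing left to gain
    return (prev - last) / prev < rel_tol
```

If `has_converged` returns `False` for the largest available training size, the curve is still improving, which is the situation the abstract reports and the basis for expecting a benefit from collecting more samples.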