Advances in highly accurate machine learning models (such as deep neural networks) have boosted their use in society. However, the structures of such high-end models are highly complex, and it is difficult to understand their inference processes or decision criteria. In this paper, I introduce some popular approaches for explaining the decisions of complex machine learning models.
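As a concrete illustration of one such popular approach, the sketch below implements occlusion sensitivity, a model-agnostic method that scores image regions by how much the predicted class probability drops when each region is masked. The `model.predict` interface and all parameter values are illustrative assumptions, not drawn from the paper itself.

```python
import numpy as np

def occlusion_map(model, image, target_class, patch=16, stride=16, fill=0.0):
    """Occlusion sensitivity: mask one patch at a time and record the drop
    in the target-class probability; a large drop marks an important region.
    `model.predict` is assumed to map a batch of images to class probabilities."""
    h, w = image.shape[:2]
    base = model.predict(image[np.newaxis])[0][target_class]  # unmasked score
    heat = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            masked = image.copy()
            masked[i * stride:i * stride + patch, j * stride:j * stride + patch] = fill
            heat[i, j] = base - model.predict(masked[np.newaxis])[0][target_class]
    return heat  # upsample and overlay on the image as an explanation heatmap
```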
With the recent advances in deep learning technology, research on artificial intelligence (AI) applications for the medical domain has been growing. However, the lack of interpretability (or explainability) of the basis of AI decisions is a major obstacle to the practical use of AI. Since physicians are responsible for the diagnosis, it is desirable that they be able to confirm the reliability of the decisions made by AI. We developed an XAI technique to realize an AI platform that allows humans and AI to co-evolve through interaction, and applied it to pathological image analysis.
In the field of image recognition, deep learning has achieved high recognition performance. However, it is difficult for people to interpret the basis of a network's decisions. Visual explanation methods use attention maps to visualize the regions that the network focused on during inference. By visualizing the attention map, we can understand the basis of the AI's decision. The attention map used for visual explanation can also be fed back into the inference process through an attention mechanism to improve recognition performance. Furthermore, the attention map can be manually adjusted and used for training, introducing human knowledge into the network. This paper presents a brief survey of visual explanation in deep learning and our research on visualizing the attention map obtained by the attention mechanism.
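As a minimal sketch of how an attention map can serve both purposes, visualization and inference, the PyTorch module below derives a single-channel spatial map from intermediate features and multiplies it back onto them. This is a generic attention mechanism under assumed layer shapes, not the authors' specific architecture.

```python
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """A 1x1 convolution plus sigmoid yields a spatial attention map in [0, 1]
    that can be upsampled and overlaid as a heatmap (visual explanation) and
    is also multiplied onto the features to reweight inference."""
    def __init__(self, channels):
        super().__init__()
        self.to_map = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):                     # feats: (B, C, H, W)
        attn = torch.sigmoid(self.to_map(feats))  # (B, 1, H, W)
        return feats * attn, attn                 # refined features, attention map
```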
This paper introduces a framework for Formula-Driven Supervised Learning (FDSL), which automatically generates image patterns and their labels to create a large-scale image dataset. We mainly focus on the Fractal DataBase (FractalDB), which consists of fractal geometry found throughout the real world. That is, an important natural principle enables us to pre-train convolutional neural networks without any natural images taken by a camera or any human-annotated labels.
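FractalDB renders images from iterated function systems (IFS); as a hedged sketch of the underlying principle, the chaos game below draws a fractal point set from a set of affine maps. The parameters shown are the classic Barnsley fern, an illustrative example, not FractalDB's actual sampled IFS codes or its label construction.

```python
import random

def chaos_game(transforms, weights, n_points=50_000):
    """Chaos game: repeatedly apply a randomly chosen affine map
    (a, b, c, d, e, f), i.e. x' = a*x + b*y + e, y' = c*x + d*y + f."""
    x, y = 0.0, 0.0
    points = []
    for _ in range(n_points):
        a, b, c, d, e, f = random.choices(transforms, weights=weights)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points

# Barnsley fern coefficients (illustrative, not FractalDB's parameters).
fern = [( 0.00,  0.00,  0.00, 0.16, 0.0, 0.00),
        ( 0.85,  0.04, -0.04, 0.85, 0.0, 1.60),
        ( 0.20, -0.26,  0.23, 0.22, 0.0, 1.60),
        (-0.15,  0.28,  0.26, 0.24, 0.0, 0.44)]
pts = chaos_game(fern, weights=[0.01, 0.85, 0.07, 0.07])
```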
In order to set up a practical experiment and research environment for deep learning, it is necessary to build a computing environment with a GPU. This is because GPU computation is several tens of times faster than computation using only the CPU. In this course, we explain how to build a GPU environment and clarify how much the computation time changes, using image segmentation as an example. At the same time, the method of image segmentation using the U-Net model and the construction of the label data are also described.
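As a minimal sketch of such an environment check and timing comparison (assuming PyTorch with CUDA support is installed; the tensor sizes are illustrative, not the course's actual U-Net settings):

```python
import time
import torch

# Verify that PyTorch can see the GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)

# Time repeated convolution passes on CPU vs. GPU.
x = torch.randn(8, 3, 256, 256)
conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1)

for dev in ["cpu", device]:
    xd, convd = x.to(dev), conv.to(dev)
    if dev == "cuda":
        torch.cuda.synchronize()  # wait for pending GPU work before timing
    start = time.time()
    for _ in range(10):
        _ = convd(xd)
    if dev == "cuda":
        torch.cuda.synchronize()
    print(f"{dev}: 10 passes in {time.time() - start:.3f} s")
```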