2021, Vol. 3, No. 2, pp. 36-42
For many years, pathologists have performed histopathological diagnoses by observing tissue specimens under a microscope. Recently, however, it has become possible to scan whole slides of tissue specimens using a dedicated slide scanner and store the resultant high-resolution digital images, i.e., whole slide images. This has led to the emergence of digital pathology, a field in which whole slide images are used for histopathological diagnoses. This field is gradually expanding, especially in large hospitals such as university hospitals. In addition, dramatic advancements in image recognition technology have been made since 2012, when deep learning won the general image recognition competition ILSVRC with overwhelming accuracy. Deep learning has since been applied to various medical images, including X-ray, ocular fundus, and skin images, and is reported to have achieved diagnostic accuracy comparable to that of general practitioners or even specialists in each field. Similarly, the use of deep learning on digital histopathological images to assist pathological diagnosis is gradually becoming practicable, especially for common diseases. Recently, advanced applications have been developed, such as searching for similar cases, predicting genetic mutations from histological images, and generating special stained images from hematoxylin and eosin-stained images. These emerging applications have the potential to greatly expand the field of diagnostic pathology and contribute to the further development of medicine. In this review, we introduce the use of deep learning technology in the field, detail the current advanced applications, and speculate on future perspectives.
There have been recent advances in digital pathology and image recognition technology using deep neural networks. Accordingly, clinical applications of computer-assisted histopathological diagnosis, such as tumor detection and mitotic cell counting, are becoming a reality. Although such traditional applications have been well discussed so far, more advanced applications, such as content-based image retrieval, prediction of genetic mutations from histological images, and virtual staining or stain-to-stain transformation, have not been well documented. Because these applications can have a huge impact on diagnostic pathology, this review introduces the current advanced applications and future perspectives.
Human tissues are composed of various cell types. As disease often changes the morphology, structure (e.g., glands), or composition of cells from normal tissues in target organs, disease diagnosis can be performed by observing stained tissue slides under a microscope. This practice is known as diagnostic pathology.
For many years, pathologists have observed tissue slides under a microscope. In recent years, however, some hospitals have begun to use a special slide scanner to capture and then store the entire glass slide as a digital image, known as a whole slide image (WSI), which can be used for remote diagnosis (Fig. 1). With the digitization of tissue images, information on slides can be easily sent to other hospitals; thus, remote diagnosis is becoming practicable. Importantly, with the advancement of digital image analysis technology, such as machine learning, computer-assisted diagnosis (CAD) is now also becoming a possibility.
Whole slide images in digital pathology. (A) Slide scanner (Hamamatsu Photonics NanoZoomer S60). (B) A hematoxylin and eosin-stained tissue slide and (C) the scanned image of the tissue slide (i.e., the whole slide image; scale bar: 2.5 mm). (D) Magnified image of the rectangular area in (C); scale bar: 50 µm.
Machine learning for images has a long history; however, tremendous progress has especially been made in a class of deep neural networks known as convolutional neural networks (CNNs) since the 2012 ILSVRC general image recognition competition was won with overwhelming accuracy [1]. CNNs include many layers of multiple convolutional kernels to extract specific local structures from images (e.g., a line or a circle), and the weights of these convolutional kernels are optimized through training with labeled images (a process known as supervised learning). CNNs are now widely used in image classification, object detection, and segmentation because of their superior performance over conventional methods. In histopathological image analysis, CNNs have been applied in the detection or segmentation of tumor cells [2] (preprint), detection of mitotic cells [3, 4], segmentation of glands [5], [6] (preprint), subtyping of tumors (such as lymphoma subtypes) [7], and grading of cancers (such as Gleason scoring of prostate cancer samples) to facilitate routine pathological diagnoses [8]. These traditional applications are important for reducing the burden on pathologists and equalizing the quality of diagnosis; however, the outstanding image recognition and feature extraction abilities of CNNs make it possible to realize more advanced and clinically important applications. These emerging applications include prediction of somatic mutations or survival time from histopathology images as well as content-based image retrieval (CBIR).
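To make the convolution operation concrete, the following minimal sketch applies a single hand-crafted kernel to a toy image. The image, the vertical-edge kernel, and all values are illustrative assumptions; in a real CNN the kernel weights would be learned from labeled data rather than fixed by hand.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, as used in CNN layers (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5x5 "image" with a vertical edge between columns 2 and 3
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# A hand-crafted vertical-edge kernel; in a CNN, such weights are learned
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

response = conv2d(image, kernel)
print(response)  # strongest responses lie along the edge
```

Stacking many such learned kernels in successive layers lets a CNN build up from edges and blobs to complex tissue structures.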
In 2014, generative adversarial networks (GANs), another important class of deep neural network, were invented by Goodfellow et al. [9] (preprint). GANs consist of two deep neural networks: a generator and a discriminator. Generators attempt to generate as realistic an image as possible, whereas discriminators (which have similar structures to CNNs) aim to discriminate between real images and the image generated by the generator. Through the competitive optimization of these two networks, generators are ultimately able to generate images that closely resemble the real images. GANs can not only create real images de novo but also transform images in one domain into those in another domain. Examples of GAN applications in general image analysis include generation of real face images with arbitrary features [10], coloring monochrome pictures [11] (preprint), and converting sketches to color photographs [12] (preprint). In diagnostic pathology, GANs are now being used to increase the resolution of microscopic images and generate special stained images from unstained or hematoxylin and eosin (H&E)-stained tissue images.
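The adversarial objective described above can be illustrated with a deliberately tiny example: the "generator" is a single scalar parameter, the "discriminator" is a fixed logistic model, and all parameter values are hypothetical. The sketch shows only the generator side of the alternating optimization; real GANs update both networks with neural-network parameterizations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy 1-D setting: real data lie near 2.0; the "generator" is a single
# parameter g (the value it outputs); the "discriminator" is a logistic model
# D(x) = sigmoid(w * x + b), the probability that x is real.
w, b = 1.0, -1.0           # hypothetical fixed discriminator parameters
real, g = 2.0, -2.0        # a real sample and the initial generator output

def d_loss(real_x, fake_x):
    # Discriminator wants D(real) -> 1 and D(fake) -> 0
    return -np.log(sigmoid(w * real_x + b)) - np.log(1 - sigmoid(w * fake_x + b))

def g_loss(fake_x):
    # Generator wants D(fake) -> 1 (non-saturating generator loss)
    return -np.log(sigmoid(w * fake_x + b))

before = g_loss(g)
# Gradient step for the generator: d/dg[log D(g)] = (1 - D(g)) * w
for _ in range(100):
    g += 0.1 * (1 - sigmoid(w * g + b)) * w
after = g_loss(g)
print(before, after)  # generator loss drops as g moves toward the real data
```

In the full algorithm, the discriminator is updated in the same loop to minimize `d_loss`, and the competition between the two losses drives the generated samples toward the real data distribution.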
In this review, we introduce various research applications of CNNs and GANs to the standard workflow of pathological diagnosis, including tasks that pathologists do not (which does not always mean “cannot”) perform, and discuss the limitations of such applications.
When pathologists encounter a case of which they have no prior knowledge, they often ask for the opinion of other pathologists or try to find similar images in reference materials such as pathology atlas books. This, however, can be a time-consuming process. If they could instead quickly find similar cases in a pathology image database, the probability of reaching a correct diagnosis could be substantially improved and the time required reduced. In image analysis, the technique of retrieving similar images by using images as queries is known as CBIR (briefly mentioned in section 1). In CBIR, each image is converted to a numerical representation, such as a high-dimensional vector, that captures the characteristics of the image. The representations of two images should be closely related when the content of the images is similar and vice versa. Many image representations have previously been used for CBIR in the field of diagnostic pathology, including local binary patterns or their variants [13] and the scale-invariant feature transform [14]. Additionally, some methods have used unsupervised or self-supervised machine learning models to optimize image features [15] [16] (preprint).
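The core of any CBIR system, regardless of how the representations are computed, is a nearest-neighbor search over feature vectors. The sketch below assumes a tiny hypothetical database of pre-computed 4-dimensional features (real systems use vectors with hundreds or thousands of dimensions extracted from a CNN) and ranks cases by cosine similarity to a query.

```python
import numpy as np

# Hypothetical pre-computed feature vectors for a small image database;
# in practice these would be CNN features, here they are made-up 4-D vectors.
database = {
    "case_A": np.array([0.9, 0.1, 0.0, 0.2]),
    "case_B": np.array([0.1, 0.8, 0.3, 0.0]),
    "case_C": np.array([0.85, 0.15, 0.05, 0.25]),
}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query, db, top_k=2):
    """Rank database cases by similarity of their representations to the query."""
    scores = {name: cosine_similarity(query, feat) for name, feat in db.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

query = np.array([0.88, 0.12, 0.02, 0.22])  # representation of the query image
print(retrieve(query, database))  # cases with similar representations rank first
```

For databases with thousands of cases, the exhaustive scan above is typically replaced by an approximate nearest-neighbor index, but the principle is unchanged.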
More recently, image features extracted from deep neural networks have been shown to outperform other features in terms of retrieval accuracy [17] [18] (preprint) [19]. Although CBIR is a promising technique, CBIR systems for diagnostic pathology have been unavailable until recently, possibly due to insufficient retrieval performance and the lack of an image database with sufficient quality. Therefore, we developed and published a web service, Luigi (https://luigi-pathology.com), which allows users to search for similar images and related genomic abnormalities in over 7,000 cases from 32 cancer types included in The Cancer Genome Atlas dataset [20] [21, 22] (preprint) (Fig. 2). Luigi uses 1,024-dimensional image features, which we call “deep texture representations,” from a special type of CNN, a bilinear CNN, with positional invariance [23] (preprint). Positional invariance is important because histopathological images differ from ordinary general images, e.g., images of dogs and cats, in that they have a texture-like structure. We found that the texture information, which is extracted from a middle layer of a CNN pretrained on numerous general images, accurately expresses the characteristics of histopathological images of tumors; hence, it can be applied to retrieve similar images. We have also developed a smartphone version of Luigi; combined with a microscope lens adapter for smartphones (sold for a few dozen dollars at most), it allows CBIR to be used in small hospitals that do not have expensive histology imaging equipment such as slide scanners and microscope digital cameras. Recently, Fuchs et al. [24] developed a freely available CBIR system that searches for similar cases across social media and PubMed.
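The positional invariance of bilinear pooling can be demonstrated directly: the descriptor averages channel co-occurrences over all spatial positions, so permuting those positions leaves it unchanged. The sketch below uses a random toy 8x8x16 activation map (a 16-channel map gives a 256-dimensional descriptor; the 1,024-dimensional features mentioned above follow the same principle with more channels), not the actual Luigi pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def deep_texture_descriptor(feature_map):
    """Bilinear (Gram-style) pooling of a CNN feature map.

    feature_map: array of shape (H, W, C) from an intermediate CNN layer.
    Returns a C*C-dimensional descriptor of average channel co-occurrences
    over all spatial positions, discarding where each feature occurred.
    """
    h, w, c = feature_map.shape
    x = feature_map.reshape(h * w, c)
    return (x.T @ x / (h * w)).ravel()

fmap = rng.random((8, 8, 16))              # hypothetical 8x8x16 activation map
desc = deep_texture_descriptor(fmap)

# Shuffling spatial positions leaves the descriptor unchanged:
perm = rng.permutation(64)
shuffled = fmap.reshape(64, 16)[perm].reshape(8, 8, 16)
print(np.allclose(desc, deep_texture_descriptor(shuffled)))  # True
```

This invariance is what makes the representation well suited to texture-like histopathological images, where the same glandular or cellular pattern may appear anywhere in a tile.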
A content-based image retrieval system: Luigi (https://luigi-pathology.com). (Left) Query image, (middle) retrieved similar images, and (right) predicted gene mutations with confidence levels. Receiver operating characteristics of the classifier on test data and histology images of mutation-positive and -negative cases are also shown.
Although CNNs have improved retrieval accuracy, CBIR-related challenges still exist in relation to pathology images. For example, there remains a need to improve robustness against stain variations. Furthermore, differences in sample preparation and equipment can affect the process. For instance, H&E stains are often used to stain pathological tissues, but color distribution varies slightly depending on the composition of the staining solution and the washing conditions (including the type of water used). The thickness of the tissue and the scanner used can also affect the color distribution and image quality of WSIs. Such differences can lead to differences in representation, which can in turn reduce the accuracy of CBIR. A simple solution to these problems is the use of image transformation techniques such as color normalization [25] (preprint) [26, 27], which adjusts the color distribution of tissue images to a reference image. However, given that the selection of a reference image can affect quality and because other differences in image quality can create biases, further research is required to refine CBIR.
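One simple form of color normalization mentioned above is per-channel mean and standard deviation matching in the style of Reinhard et al. The sketch below uses random toy tiles rather than real stained images, and operates directly in RGB to stay short (the original Reinhard method works in the LAB color space).

```python
import numpy as np

def normalize_stain(image, reference):
    """Per-channel mean/std matching (Reinhard-style color normalization).

    image, reference: float RGB arrays of shape (H, W, 3). The output has the
    reference tile's per-channel color statistics but the input tile's content.
    """
    out = image.astype(float).copy()
    for ch in range(3):
        src, ref = image[..., ch], reference[..., ch]
        out[..., ch] = (src - src.mean()) / (src.std() + 1e-8) * ref.std() + ref.mean()
    return out

rng = np.random.default_rng(1)
ref = rng.normal(0.7, 0.10, (32, 32, 3))   # toy "reference" tile
img = rng.normal(0.5, 0.05, (32, 32, 3))   # toy tile with a different color cast
norm = normalize_stain(img, ref)
print(norm[..., 0].mean())  # close to the reference channel mean
```

As noted above, the result depends on the choice of reference tile, which is one reason such normalization alone does not fully solve stain-variation problems.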
Prediction of clinical biomarkers

Supervised machine learning can be used to predict not only existing disease classifications but also features that are usually outside the scope of pathological diagnosis, e.g., patient prognosis and genomic features. For example, survival or recurrence of colon cancer [28], melanoma [29], or mesothelioma [30] can be predicted from H&E-stained images using CNNs. Furthermore, Li et al. [31] incorporated topological structures of WSIs using graph CNNs to predict the survival of patients with lung or brain cancer.
Recent advances in parallel sequencers have made it possible to comprehensively examine cancer genome information, which has led to genome-based diagnosis and drug selection. Although cancer genome analysis is a powerful tool, it cannot realistically be used for every patient because of its high costs. In contrast, histopathological diagnosis is an essential tool that is routinely used. If it becomes possible to predict genomic aberrations from histology images alone, histopathological diagnosis could serve as an inexpensive screening tool, with particular relevance to developing countries where genome analysis is not easily accessible. Even before the digitization of histopathological images, the association between genetic mutations and histological morphology in various cancers had been studied by pathologists [32,33,34]. However, because multiple combinations of driver genes and cancer types exist, it remains difficult for pathologists to identify such associations on a large scale. To overcome these challenges, supervised learning models that predict mutations from histopathology data are a useful application. For example, Coudray et al. [35] showed that the presence or absence of six somatic gene mutations, including mutations in TP53, EGFR, and KRAS, could be accurately estimated from WSIs of lung cancer using CNNs. Subsequently, it has been shown that SPOP mutations in prostate cancer [36] (preprint) and BRAF and NRAS mutations in melanoma [37] (preprint) can be estimated from histopathological images. Prediction of actionable genomic aberrations from H&E images alone has received substantial attention because of its potential clinical utility.
For example, some researchers have reported that patients with microsatellite-instable tumors in some cancer types, such as stomach or colon adenocarcinoma, respond to immune checkpoint inhibitors, and that microsatellite instability status can be predicted with high accuracy from H&E-stained images using a deep neural network [38, 39]. Furthermore, in some studies, including our own, prediction models have been developed for tens of clinically actionable mutations [22, 40, 41]. Some of our models are implemented in the abovementioned Luigi system.
Other attempts have been made to predict prognosis with greater accuracy using the combination of cancer genome information and histological images. For example, Mobadersany et al. [42] used a specialized CNN known as the genomic survival convolutional neural network to analyze histopathological images of brain tumors together with related genomic abnormality information. In addition, Cheerla et al. [43] showed that the prognosis of brain tumors could be accurately estimated using such methods, albeit to a lesser extent than by analyzing genomic abnormalities alone.
Virtual staining

Depending on the purpose, tissue specimens are stained in various ways before being examined under a microscope. Although these steps are essential for histopathological diagnosis, they are time consuming, labor intensive, and expensive, especially for special stains. Additionally, with small tissue specimens, it may not be possible to obtain sufficient tissue sections for multiple staining. In recent years, a technology has been developed by which unstained specimens can be virtually stained; this technology learns the correspondence between unstained and stained specimens through image conversion technologies such as GANs [44, 45]. Another technique has been proposed in which images that have already been stained using one method (mostly with H&E) are converted to appear stained by other special methods [46] (preprint) [47]; this is known as stain-to-stain transformation [48]. GANs are usually trained using paired images. When unstained images are virtually stained by GANs, paired images for training can be prepared from the exact same tissue; however, for stain-to-stain transformations, images of serial sections that differ slightly are usually prepared, and the two sections are each stained with a different stain. It is relatively easy to align unstained specimens with stained specimens because their positions are perfectly matched; thus, accurate methods for paired images, such as pix2pix [49] (preprint), can be used (Fig. 3A). In contrast, during the conversion from H&E staining to special staining, the two images from serial sections differ slightly. Although the accuracy is lower than that obtained when using paired images, the CycleGAN [50] (preprint) technique can be applied to such unpaired images (Fig. 3B).
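The difference between the paired and unpaired training objectives can be sketched with toy linear "generators" standing in for the image-to-image networks; the functions and pixel values below are purely illustrative. In the paired setting, a direct pixel-wise L1 reconstruction loss is available; in the unpaired setting, a cycle-consistency loss replaces it alongside the adversarial losses (which are omitted here for brevity).

```python
import numpy as np

# Toy "generators": linear maps standing in for image-to-image networks.
# G: H&E -> special stain, F: special stain -> H&E (both hypothetical).
def G(x): return 2.0 * x + 1.0
def F(y): return (y - 1.0) / 2.0

x = np.array([0.2, 0.5, 0.8])          # toy "H&E" pixel values

# Paired setting (pix2pix-style): the ground-truth target y is available,
# so the generator can be trained with a direct L1 reconstruction loss.
y_true = np.array([1.4, 2.0, 2.6])
l1_loss = np.abs(G(x) - y_true).mean()

# Unpaired setting (CycleGAN-style): no pixel-wise target exists, so training
# adds a cycle-consistency loss F(G(x)) ~= x on top of the adversarial losses.
cycle_loss = np.abs(F(G(x)) - x).mean()
print(l1_loss, cycle_loss)  # both near zero for these consistent toy mappings
```

The cycle-consistency constraint is what allows training from two unaligned image collections, at the cost of the weaker supervision noted above.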
Although these techniques seem promising, current evaluations are mainly based on reconstruction accuracy; indeed, few studies have evaluated the improvements in the quality of diagnoses [51] (preprint). Therefore, the clinical utility of the techniques in terms of final diagnostic accuracy and time/cost reduction in various applications should be assessed in future research.
Limitations of deep neural networks in diagnostic pathology

Although deep neural networks are promising technologies, they have some limitations when applied to diagnostic pathology. First, the number of training samples is often insufficient, especially for rare diseases for which few training samples are available; thus, it is difficult to develop a machine learning model with good performance. Therefore, techniques that can learn accurately even from a small number of samples, such as few-shot learning or few-shot GANs, are often necessary [52, 53] (preprint).
Interpreting the resultant models can also be a problem. For instance, if a category, e.g., mutation status, is predictable, it is important to know which features were used for the prediction, to determine whether the model is biased and whether the prediction makes biological sense. Algorithms for presenting the regions that contribute to the decision making of deep neural networks have been proposed, e.g., Grad-CAM [54] and integrated gradients [55] (preprint). However, whereas highlighting specific areas of an image is usually sufficient for interpreting general images, it is sometimes difficult to interpret pathological images in this way. Such interpretability problems also exist for CBIR and virtual staining. Thus, visualization methods specifically suited to pathological images should be investigated further.
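The Grad-CAM computation itself is compact: each activation map of the last convolutional layer is weighted by its spatially averaged gradient with respect to the target class score, the weighted maps are summed, and negative evidence is clipped. The sketch below runs on hand-made toy activations and gradients rather than a real network.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Minimal Grad-CAM: weight each activation map by its spatially averaged
    gradient, sum over channels, and keep only positive evidence (ReLU).

    activations, gradients: arrays of shape (C, H, W) from the last conv layer.
    Returns an (H, W) heatmap of regions supporting the predicted class.
    """
    weights = gradients.mean(axis=(1, 2))              # alpha_k: per-channel weight
    cam = np.tensordot(weights, activations, axes=1)   # weighted sum over channels
    return np.maximum(cam, 0.0)                        # ReLU

# Toy data: channel 0 fires in the top-left corner with a positive gradient;
# channel 1 fires elsewhere but its gradient is negative (counter-evidence).
acts = np.zeros((2, 4, 4))
acts[0, 0, 0] = 1.0
acts[1, 3, 3] = 1.0
grads = np.stack([np.full((4, 4), 0.5), np.full((4, 4), -0.5)])

heatmap = grad_cam(acts, grads)
print(heatmap)  # only the top-left region is highlighted
```

In a WSI setting, such a heatmap would be upsampled onto the tissue tile; the difficulty noted above is that a highlighted region alone may not tell a pathologist which morphological feature the model actually used.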
Previous research has shown that deep learning algorithms are effective on histopathology images. In future research, it will be important to explore the use of these models in actual pathological diagnosis and to assess whether they improve diagnostic accuracy and reduce costs when incorporated into a system. To improve the usefulness of the applications themselves, the accumulation of data, including training data, will be important. The establishment of a system by which data from many medical institutions can be accumulated and made available to many researchers and doctors is expected to bring these applications closer to practical use.
The authors declare that they have no conflicts of interest.
This work was supported by the AMED Practical Research for Innovative Cancer Control [Grant Number JP 20ck0106400h9903 and 21ck0106640 to S.I.] and JSPS KAKENHI Grant-in-Aid for Scientific Research (B) [Grant Number 21H03836 to D.K].