2025 Volume 50 Issue 11 Pages 637-647
Morphological observation and classification of bone marrow cells in smear specimens is an important examination in toxicity studies for pharmaceuticals. However, acquiring the expertise for classifying bone marrow cells using light microscopy requires years of training, and the examination itself demands significant labor and time. To efficiently acquire accurate and objective data without oversight, a system for the automated detection and morphological classification of rat bone marrow cells was developed through machine learning using whole slide images (WSIs) obtained from smear specimens. Our system integrates SSD300 for object detection, VDSR for image super-resolution, and EfficientNetV2B0 for classification. WSIs of rat bone marrow smear specimens were obtained at 40× magnification using a WSI scanner. The bone marrow cell detection model based on SSD300 was fine-tuned with 720 images obtained from WSIs of rat bone marrow smear specimens. The morphological classification model for 13 types of bone marrow cells using VDSR-EfficientNetV2B0 was optimized with a total of 144,000 cell images. The system for detection of bone marrow cells achieved a mean average precision of 79%. Additionally, the morphological classification achieved an accuracy of 98% when compared to expert classification. Our algorithm enabled the automated classification of cells on rat bone marrow smear specimens with extremely high accuracy and in a short time, approximately 80 sec to classify 5,000 cells per image, without oversight. This capability suggests that the algorithm could potentially be utilized as a supportive tool for the toxicity evaluation of bone marrow smear specimens.
A bone marrow smear examination classifies bone marrow cells into cell types on the basis of their morphology in a smear specimen, and it is an important examination in toxicity studies for evaluating the effects of a compound on the bone marrow. However, the classification of bone marrow cell morphology is typically performed manually by an expert using light microscopy. This process not only requires a large amount of labor and time but is also affected by the subjective perspective of the expert, leading to issues such as inconsistency between different examiners and low reproducibility of judgments. Furthermore, because the task is highly specialized, becoming an expert requires many years of training on a very large number of bone marrow smear images. Therefore, the use of digital microscopy and machine learning in this field holds great potential to achieve more accurate and objective results while minimizing human intervention.
In recent years, many AI-based diagnostic imaging efforts have been made in the medical field (Lv et al., 2023). However, there are still few clinical reports on the evaluation of bone marrow smear images (Tayebi et al., 2022), and to our knowledge, only one case has been reported for nonclinical toxicity studies (Yamaguchi et al., 2024). Furthermore, no publicly available datasets of rat bone marrow smear images exist to support AI training. High-resolution bone marrow cell imaging typically necessitates manual operation with a camera mounted on a high-magnification light microscope, a procedure that requires a large amount of labor and time because of the extensive area and numerous images needed. In contrast, the use of whole slide image (WSI) scanners enables high-throughput acquisition of images over wide areas, thereby reducing both labor and time. However, WSIs are commonly obtained at low magnification, which results in inadequate resolution for capturing fine bone marrow structures and this restricts classification accuracy. To fully exploit WSIs for automated bone marrow cell analysis, it is therefore essential to overcome these image-quality limitations while preserving high-throughput acquisition, which underscores the need for further research.
In this study, we aimed to develop a high-throughput automated AI system for rat bone marrow smear images that addresses the challenges of labor-intensive manual classification and the scarcity of annotated data. To achieve this, we developed an automated classification model for rat bone marrow smear images using WSIs captured with a 40× objective lens, which will be a useful tool in toxicity studies. Training neural network models typically requires a large amount of labeled data; however, to our knowledge, no publicly available datasets of rat bone marrow smear images exist at this time. Constructing such datasets from scratch would demand considerable time, effort, and specialized expertise. To overcome this limitation, we employed cross-domain learning and fine-tuning techniques to build an effective classification model using a limited number of annotated rat bone marrow smear images. Additionally, the low-resolution images obtained from WSIs were enhanced to improve resolution prior to classification.
This approach enabled us to markedly reduce the required workload and time for dataset creation while maintaining the performance necessary for accurate model training.
In this study, we propose an artificial intelligence (AI) architecture for the automatic detection and morphological classification of bone marrow cells from rat bone marrow smear images. The proposed method employs the Single Shot MultiBox Detector (SSD300 model) for cell detection (Liu et al., 2016), and a convolutional neural network (EfficientNetV2B0 model) for morphological classification (Tan and Le, 2021). A flowchart illustrating the optimization strategies for these models is shown in Fig. 1.

Flowchart of the optimization strategies for an automated rat bone marrow cell classification system. Dark blue nodes represent model optimization based on rat bone marrow images, whereas light blue nodes represent optimization using images derived from non-rat sources.
The following software was used in this work: Keras 3.4.1, TensorFlow 2.17.0, openslide-python 1.3.1, NumPy 1.26.4, seaborn 0.13.2, and Pillow 10.4.0. The hardware was a computer server (Dell Inc., USA) equipped with a Xeon W5-2465X processor (Intel Corporation, USA) and an RTX A6000 48GB GPU (NVIDIA Corporation, USA).
Creation of WSIs
We prepared WSIs for the AI system development. A total of 15 May-Grünwald-Giemsa-stained glass slides of rat bone marrow smears from a toxicity study conducted at Japan Tobacco, Inc. were scanned using a NanoZoomer S360 (Hamamatsu Photonics K.K., Japan) at 40× magnification and converted into WSIs. Bone marrow smears were prepared from fifteen male Sprague-Dawley (Crl:CD(SD)) rats (Charles River Laboratories Japan, Inc.), six weeks of age at the initiation of dosing. The animals were randomly allocated into three groups (5 animals per group) and were administered water for injection, corn oil, or the test compound for two weeks. At necropsy, bone marrow was collected from the femur and then suspended in EDTA-2K-treated inactivated serum. A drop of the suspension was placed onto a glass slide and smeared using a benchtop centrifugal smearing device (SPINNER 2000, Lion Power Co., Ltd., Japan). The smears were then air-dried and stained with May-Grünwald-Giemsa. WSIs were generated using a NanoZoomer S360 with a semi-automatic scanning mode. The scanned area covered approximately 90,000 × 110,000 pixels (230 nm/pixel), including the central region of the glass slide. Scanning was performed at 40× magnification in a single layer, and the images were saved in NanoZoomer Digital Pathology Image (NDPI) format.
Pretraining of the SSD300 model using blood cell smear images for dataset creation
Since no publicly available datasets of annotated rat bone marrow smear images scanned with a 40× objective lens exist, it was necessary to construct a dataset suitable for optimizing the SSD300 model for rat bone marrow cell detection. To facilitate the creation of this dataset, we hypothesized that a detection model trained to identify white blood cells, which have a morphology similar to that of bone marrow cells, could serve as a foundation. Accordingly, we developed a white blood cell detection model based on the SSD300 model using blood smear images and corresponding annotation data for red blood cells, white blood cells, and platelets from the Blood Cell Images Dataset provided by Kaggle (https://www.kaggle.com/datasets/paultimothymooney/blood-cells). Kaggle is a well-known online community for individuals engaged in data science and machine learning. The images in the Blood Cell Images Dataset are high-resolution (640×480 pixels). However, the size of white blood cells in these images does not match the size of the bone marrow cells observed in rat bone marrow smear images scanned with a 40× objective lens. To address this discrepancy, we randomly combined four images to create 1280×960-pixel composite images and then resized them to 320×240 pixels to approximate the size of cells captured under the 40× objective lens. A total of 364 blood smear images were generated, of which 291 images were used as the training set and 73 images as the validation set. During the optimization of the SSD300 model, the convolutional layers up to the third layer were fixed using pretrained weights from ImageNet (Deng et al., 2009). The layers from the fourth onward were retrained from scratch with blood smear images to optimize the model for cell detection.
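The composite step above also requires the bounding-box annotations to follow the pixels. The following is a minimal sketch of that remapping, not the authors' code: it assumes the four 640×480 images are laid out in a 2×2 grid (the layout is not stated in the text), and the names `remap_boxes`, `Box`, and `GRID` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in pixels

SRC_W, SRC_H = 640, 480   # size of one source image (from the text)
OUT_W, OUT_H = 320, 240   # size of the resized composite (from the text)
GRID = 2                  # ASSUMPTION: 2x2 layout, giving 1280x960 before resizing


def remap_boxes(boxes_per_image: List[List[Box]]) -> List[Box]:
    """Map boxes from four source images into the resized 2x2 composite."""
    if len(boxes_per_image) != GRID * GRID:
        raise ValueError("expected box lists for exactly four images")
    sx = OUT_W / (SRC_W * GRID)  # horizontal scale factor, 320 / 1280
    sy = OUT_H / (SRC_H * GRID)  # vertical scale factor, 240 / 960
    remapped: List[Box] = []
    for idx, boxes in enumerate(boxes_per_image):
        col, row = idx % GRID, idx // GRID   # position in the grid, row-major
        dx, dy = col * SRC_W, row * SRC_H    # offset of this image in the composite
        for xmin, ymin, xmax, ymax in boxes:
            remapped.append(((xmin + dx) * sx, (ymin + dy) * sy,
                             (xmax + dx) * sx, (ymax + dy) * sy))
    return remapped
```

Under these assumptions every cell shrinks by a factor of four in each dimension, which is the size reduction the composite-and-resize step is meant to achieve.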
Adaptive Moment Estimation (Adam) (Kingma and Ba, 2017) was used to optimize the model weights, and the MultiBox loss function, which combines a cross-entropy term and a smooth L1 term (Liu et al., 2016), was used for the loss calculation. The learning rate for Adam was set to 0.001. During optimization, saturation, brightness, contrast, channel shift, lighting, horizontal flip, and vertical flip were applied randomly to the images for data augmentation.
Creation of annotated rat bone marrow smear images for training the SSD300 model
Since no existing dataset provides bone marrow cell annotations for rat bone marrow smear images scanned with a 40× objective lens, it was necessary to create a dataset for optimizing the SSD300 model. Rat bone marrow smear images were randomly extracted from 8 WSIs scanned with a 40× objective lens, using rat bone marrow smear slides prepared in our laboratory. A total of 800 images, each 300×300 pixels in size, were obtained. Annotation labels for bone marrow cells, red blood cells, and platelets were applied to the rat bone marrow smear images using the pre-optimized SSD300 model, which was based on blood smear images provided by Kaggle. Subsequently, we manually corrected inappropriate annotations and applied three types of annotation labels based on the staining characteristics of the bone marrow cells. Additionally, the annotation labels for platelets and areas identified as debris in the images were modified, resulting in five types of annotation labels (red blood cell, bone marrow cell-1, bone marrow cell-2, bone marrow cell-3, and platelet). The images were assigned to the training set (720 images) and the test set (80 images).
Dataset generation of simulated low-resolution bone marrow cell images for training the VDSR model
The VDSR model is a deep learning-based single-image super-resolution technique (Kim et al., 2016) that incorporates convolutional layers and skip connections. The maximum magnification of the objective lens included with a general-purpose WSI scanner is 40×, and a single bone marrow cell image obtained from a rat bone marrow smear scanned with a 40× objective lens is approximately 80×80 pixels in size and has a low resolution. However, the image input size for the EfficientNetV2B0 model, which is used for classifying the morphology of bone marrow cells, is 224×224 pixels, so resizing is required. The commonly used interpolation methods for image resizing are bilinear and bicubic interpolation, which can cause a loss of sharpness. Therefore, the input images for the EfficientNetV2B0 model were generated by resizing the cell images and enhancing their resolution with the VDSR model.
The dataset used for optimizing the VDSR model was modified from the human bone marrow cell dataset provided by Kaggle (https://www.kaggle.com/datasets/andrewmvd/bone-marrow-cell-classification). Each image in this dataset consists of a single high-resolution (250×250 pixels) human bone marrow cell. To simulate the appearance of low-resolution images similar to those of rat bone marrow cells, we downscaled the images to 30×30 pixels and subsequently upscaled them to 224×224 pixels using bicubic interpolation. The resulting dataset comprised 104,000 images for training and 26,000 images for validation. In parallel, images of the human bone marrow cell dataset resized directly from 250×250 to 224×224 pixels were used as supervised data during the optimization process.
Dataset creation of human bone marrow cell images for pre-training the EfficientNetV2B0 model
As shown in Table 1, the dataset used to optimize the deep neural network for morphological classification of bone marrow cells was created using images of 13 morphological types obtained from the human bone marrow cell dataset provided by Kaggle (https://www.kaggle.com/datasets/andrewmvd/bone-marrow-cell-classification). Since the number of human bone marrow cell images available for EfficientNetV2B0 model optimization was insufficient for some morphologies, the number of images was augmented by random rotation, resulting in 8,000 images for the training set, 2,000 images for the validation set, and 150 images for the test set for each morphology. The images were resized from 250×250 pixels to 244×244 pixels.

Given the absence of an existing dataset of classified rat bone marrow cell images for optimizing a deep neural network model for morphological classification of rat bone marrow cells, a new dataset was developed from scratch. The rat bone marrow smear slides prepared in our laboratory were scanned with a 40× objective lens to obtain 15 WSI images. The SSD300 model for rat bone marrow detection was applied to the 2 WSI images, resulting in the acquisition of 10,000 rat bone marrow cell images. Morphological classification was then manually performed by experts in our laboratory. Images for which at least two out of three experts classified the cells as the same morphology were included in the dataset. Images for which morphological classification could not be determined, even by experts, were excluded from the dataset. For certain morphological types of bone marrow cells, the number of available images was insufficient due to their low prevalence in vivo. Therefore, image augmentation was performed using random rotation and random incorporation of partial images from the VDSR model's training set. After resizing to 244×244 pixels, the dataset consisted of 40,049 training images, 4,449 validation images, and 1,300 test images.
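The inclusion rule above (at least two of three experts agree; undeterminable images are excluded) can be expressed as a small consensus function. This is a minimal sketch under stated assumptions, not the authors' procedure: the function name `consensus_label` is hypothetical, and encoding an expert's "could not determine" as `None` is an assumption made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(votes: Sequence[Optional[str]], min_agree: int = 2) -> Optional[str]:
    """Return the morphology chosen by at least `min_agree` experts, else None.

    ASSUMPTION: an expert who could not determine the morphology votes None.
    A None result means the image is excluded from the dataset.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None  # no expert could classify the cell
    label, n = counts.most_common(1)[0]
    return label if n >= min_agree else None
```

With three voters and a threshold of two, at most one label can reach the threshold, so the result is unambiguous.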
Fine-tuning of the SSD300 model for rat bone marrow cell detection
The SSD300 model, which had been pre-optimized for blood smear images, was fine-tuned for detecting bone marrow cells in rat bone marrow smear images. For the fine-tuning of the model, rat bone marrow smear images of 300×300 pixels, each labeled with five types of annotations (red blood cell, bone marrow cell-1, bone marrow cell-2, bone marrow cell-3, and platelet), were used. The SSD300 model for detecting rat bone marrow cells was optimized using the same procedure as that employed for optimizing the SSD300 model for white blood cell detection.
Pre-optimization of the VDSR model for high-resolution bone marrow cell images
The VDSR model, consisting of 20 convolutional layers, was designed and optimized from scratch to enhance the resolution of the individual bone marrow cell images obtained from rat bone marrow smears scanned using a 40× objective lens. Adam was used for optimization, with mean squared error (MSE) as the loss function and peak signal-to-noise ratio (PSNR) as the evaluation metric (Keleş et al., 2021). The learning rate for Adam was set to 0.001.
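For reference, PSNR follows directly from the MSE used as the loss: PSNR = 10·log10(MAX² / MSE), where MAX is the largest possible pixel value. The sketch below computes it over flat sequences of pixel values; the function name `psnr` is hypothetical, and the default `max_value` of 255 is an assumption, since the pixel range used in this study is not stated.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    ASSUMPTION: `max_value` is the peak pixel value (255 for 8-bit images,
    1.0 for images scaled to [0, 1]).
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is a monotonically decreasing function of MSE, minimizing the MSE loss during training also maximizes this metric.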
Pre-optimization of the EfficientNetV2B0 model using human bone marrow cell images
EfficientNetV2B0 is a deep learning model built from repeated mobile inverted bottleneck convolution (MBConv) blocks and fused mobile inverted bottleneck convolution (Fused-MBConv) blocks, arranged in a total of six repeated stages (Tan and Le, 2021). Two additional dense layers and a dropout layer were added to the EfficientNetV2B0 model to classify human bone marrow cells into 13 different morphologies, and the model was optimized. The model parameters were optimized using Adam, with categorical cross-entropy as the loss function and accuracy, precision, and recall as the evaluation metrics. All layers of the model were optimized from scratch. The learning rate for Adam was set to 0.001. During optimization, contrast, brightness, vertical flip, and horizontal flip were applied randomly to the images for data augmentation.
Fine-tuning of the VDSR-EfficientNetV2B0 model using rat bone marrow cell images
The bone marrow cell morphology classification model based on EfficientNetV2B0 was pre-optimized using high-resolution bone marrow cell images from the human bone marrow cell dataset provided by Kaggle. The model was fine-tuned using rat bone marrow cell images to classify the morphology of the rat bone marrow cells. However, when single bone marrow cell images are extracted from the rat bone marrow smear images scanned with a 40× objective lens, the resulting images are typically 80×80 pixels in size. When the images are input into the EfficientNetV2B0 model, they are resized to 244×244 using bilinear interpolation, resulting in a loss of sharpness. To address this issue, the super-resolution technique VDSR was applied after resizing. A sequentially connected model based on VDSR and EfficientNetV2B0 was constructed for the classification of rat bone marrow cell morphology, and the parameters were fine-tuned. The parameters of VDSR and the EfficientNetV2B0 model were initially set to pre-optimized values, and the parameters of the EfficientNetV2B0 model from Stage 6 onward were subsequently fine-tuned for the classification of 13 morphologies. The model parameters were optimized using Adam, with categorical cross-entropy as the loss function and accuracy, precision, and recall as the evaluation metrics. The learning rate for Adam was set to 0.001. When the high-resolution images generated by the VDSR model were input into the EfficientNetV2B0 model, contrast, brightness, vertical flip, and horizontal flip were randomly applied for data augmentation.
Automated AI system for detection and classification of rat bone marrow cells
Each WSI was initially divided into large tiles, which were subsequently subdivided into smaller tiles. A subset of the large tiles was randomly selected, and several small tiles were then randomly extracted from each one. This procedure yielded a dataset of rat bone marrow smear images. These images were first processed using the optimized SSD300 model to detect bone marrow cells. From the detected regions, some areas were randomly selected and used as target images for morphological classification. The sequentially connected rat bone marrow cell morphology classification model, based on the optimized VDSR-EfficientNetV2B0 model, was applied to classify the rat bone marrow cells into 13 morphologies.
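The two-level random tiling can be sketched as a coordinate sampler. This is an illustrative sketch rather than the authors' implementation: the defaults (3000×3000-pixel large tiles, 300×300-pixel small tiles, 200 large tiles, 10 small tiles each) are the values reported for the final system in this article, while the function name `sample_small_tiles`, the `seed` parameter, and the choice to drop incomplete tiles at the slide border are assumptions. Reading the pixels at each coordinate (for example with openslide-python) is left out.

```python
import random
from typing import List, Optional, Tuple


def sample_small_tiles(width: int, height: int, large: int = 3000, small: int = 300,
                       n_large: int = 200, n_small: int = 10,
                       seed: Optional[int] = None) -> List[Tuple[int, int]]:
    """Return top-left (x, y) coordinates of randomly chosen small tiles.

    ASSUMPTION: large tiles that would extend past the slide border are skipped.
    """
    rng = random.Random(seed)
    # Top-left corners of all complete large tiles on the slide.
    large_origins = [(x, y)
                     for y in range(0, height - large + 1, large)
                     for x in range(0, width - large + 1, large)]
    chosen = rng.sample(large_origins, min(n_large, len(large_origins)))
    per_side = large // small  # small tiles per large-tile edge (10 by default)
    tiles: List[Tuple[int, int]] = []
    for lx, ly in chosen:
        # Pick distinct small-tile cells inside this large tile.
        for cell in rng.sample(range(per_side * per_side), n_small):
            tiles.append((lx + (cell % per_side) * small,
                          ly + (cell // per_side) * small))
    return tiles
```

For a scan of roughly 90,000 × 110,000 pixels, the defaults give 200 × 10 = 2,000 non-overlapping 300×300-pixel regions, matching the SSD300 input size.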
To evaluate the performance of the optimized SSD300 model as a detection model for bone marrow cells, the average precision (AP) was computed for each class and the mean average precision (mAP) (Everingham et al., 2010) was calculated from the AP values. A test set consisting of 80 rat bone marrow smear images with annotations for 5 classes (red blood cell, bone marrow cell-1, bone marrow cell-2, bone marrow cell-3, and platelet) was used. For the evaluation of the model capable of detecting both bone marrow cells and erythrocytes, the three bone marrow cell classes (bone marrow cell-1, bone marrow cell-2, and bone marrow cell-3), which were annotated based on staining characteristics, were consolidated into a single class. Additionally, classification of the platelets was excluded from the evaluation. The test set was applied to the rat bone marrow cell detection model, and detection regions with a confidence score of 0.8 or higher were attributed to bone marrow cells and erythrocytes. The AP values for the bone marrow cells and erythrocytes were 70.50% and 87.80%, respectively. Additionally, the mAP was 79.15%. In object detection models, a mAP value of 0.75 or higher is generally considered indicative of a high-precision model. Fig. 2 shows bone marrow cells and erythrocytes detected using the SSD300 model.
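The class consolidation and the mAP arithmetic described above can be illustrated as follows. This is a minimal sketch, not the evaluation code used in the study: the names `consolidate`, `mean_average_precision`, `MERGE`, and `EXCLUDED` are hypothetical, and the per-class AP values are simply the ones reported in the text (computing AP itself from precision-recall curves is not shown).

```python
from statistics import mean
from typing import Dict, Optional

# The three staining-based classes are merged into one class for evaluation.
MERGE = {
    "bone marrow cell-1": "bone marrow cell",
    "bone marrow cell-2": "bone marrow cell",
    "bone marrow cell-3": "bone marrow cell",
}
EXCLUDED = {"platelet"}  # platelets are left out of the evaluation


def consolidate(label: str) -> Optional[str]:
    """Map an annotation label to its evaluation class, or None if excluded."""
    if label in EXCLUDED:
        return None
    return MERGE.get(label, label)


def mean_average_precision(ap_by_class: Dict[str, float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return mean(ap_by_class.values())


reported_ap = {"bone marrow cell": 70.50, "red blood cell": 87.80}
map_percent = mean_average_precision(reported_ap)
```

With the two reported AP values, (70.50 + 87.80) / 2 gives the stated mAP of 79.15%.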

Detection of bone marrow cells and erythrocytes using the SSD300 model. A 300×300-pixel image extracted from a WSI of a rat bone marrow smear sample scanned with a 40× objective lens. Cells within the cyan squares represent bone marrow cells, while those within the magenta squares represent erythrocytes.
To enhance the resolution of the bone marrow cell images, the VDSR model was optimized using simulated low-resolution bone marrow cell images, comprising 104,000 images for training and 26,000 images for validation. As a result, the PSNR for the validation set was 33.45 dB. In general, a PSNR value of 35 dB or higher is considered indicative of high-quality image reconstruction. The original images, low-resolution images, and high-resolution images generated using the VDSR model are shown in Fig. 3. The generated high-resolution images, while not fully restoring the original images, successfully restored the fine structures of the bone marrow cells. As a result, bone marrow cells that were difficult to classify morphologically even by experts because they were low-resolution images became identifiable through resolution enhancement. Image enhancement using VDSR resulted in effects comparable to those observed when bone marrow cells are captured with a high-magnification objective lens, such as 100×.

Super-resolution of bone marrow cell image using the VDSR model. The left image represents the original high-resolution bone marrow cell image. The middle image shows bone marrow cell image with artificially reduced resolution. The right image presents the super-resolved bone marrow cell image obtained using the VDSR model. Image resolution was enhanced using the VDSR model, making intracellular structures clearer and enabling morphological classification comparable to that of the original high-resolution images.
The EfficientNetV2B0 model, serving as the backbone of the rat bone marrow cell morphology classification model, was optimized using a training set of 8,000 images and a validation set of 2,000 images for each morphology derived from the human bone marrow cell dataset. As a result, accuracy, precision, and recall for the validation set were 89.52%, 89.86% and 89.32%, respectively, indicating that the model achieved robust classification performance. A confusion matrix of the test set for evaluating the performance of each classification is shown in Fig. 4. Mix-ups were observed between the categories of promyelocyte and myelocyte, as well as between band neutrophils and segmented neutrophils. However, since these morphologies represent continuous stages in the differentiation of the bone marrow cells, intermediate forms are often present. Therefore, such mix-ups are considered tolerable within the model's classification performance.

Summary of human bone marrow cell classification results obtained from the EfficientNetV2B0 model predictions. A confusion matrix was constructed based on the evaluation of 150 images per morphology. Additionally, for some morphologies with insufficient image counts, additional human bone marrow cell images were incorporated. The values within the confusion matrix represent the number of bone marrow cell images classified into each morphology. The vertical axis indicates the ground truth annotated by Kaggle, while the horizontal axis represents the predicted results by the EfficientNetV2B0 model.
A sequentially connected model based on the VDSR and EfficientNetV2B0 models was fine-tuned for rat bone marrow cell morphology classification. The model was fine-tuned using a training set of 40,049 images and a validation set of 4,449 images, both of which were manually classified by the experts. As a result, accuracy, precision, and recall for the validation set were 98.1%, 98.1%, and 97.4%, respectively, indicating that the model achieved robust classification performance. A confusion matrix of the test set for evaluating the performance of each classification is shown in Fig. 5.

Summary of rat bone marrow cell classification results obtained from VDSR-EfficientNetV2B0 model predictions. A confusion matrix was constructed based on the evaluation of 100 rat bone marrow cell images per morphology. The values within the confusion matrix represent the number of bone marrow cell images classified into each morphology. The vertical axis indicates the ground truth determined by the experts, while the horizontal axis represents the predicted results by VDSR-EfficientNetV2B0 model.
A system was developed to automatically detect bone marrow cells from the WSI of rat bone marrow smear slides scanned with a 40× objective lens and classify them into 13 different morphologies of rat bone marrow cells. Each WSI was first divided into large tiles sized 3000×3000 pixels. Each large tile was then further subdivided into small tiles of 300×300 pixels. Subsequently, 200 large tiles were randomly selected, and 10 small tiles were randomly extracted from each of them. This procedure resulted in a dataset of 2,000 rat bone marrow smear images each 300×300 pixels in size. The rat bone marrow smear images were first processed using the SSD300 model to detect bone marrow cells. The bone marrow cells, which were initially classified into three types (bone marrow cell-1, bone marrow cell-2, and bone marrow cell-3) based on staining, were unified into a single type before being detected as rat bone marrow cells. In the detected bone marrow cell regions, areas with a confidence score of less than 0.85 or with their center located within 30 pixels of the image edge were regarded as undetected and were excluded from morphological classification. Five thousand regions were randomly selected from the detected rat bone marrow cell regions and used as target images for morphological classification. After resizing the images to 244×244 pixels using bilinear interpolation, the sequentially connected rat bone marrow cell morphology classification model, based on the VDSR-EfficientNetV2B0 model, was applied to classify the rat bone marrow cells into 13 morphologies. For morphology classification, instances with a maximum predicted probability of less than 0.4 were labeled as unclassified. The time required for bone marrow cell detection and morphology classification from the rat bone marrow smear images using this system was approximately 80 sec for classifying 5,000 cells per image. 
The system ultimately outputs all classified bone marrow cell images, their coordinates on the WSI, classification results, and prediction probabilities in a tabular format to facilitate expert review. Additionally, a histogram summarizing the number of classified cells is generated. Fig. 6 shows the workflow of the proposed system.
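The two rejection rules used by the system (detections with a confidence score below 0.85 or a center within 30 pixels of the tile edge, and classifications with a maximum probability below 0.4) can be sketched as simple predicates. This is an illustration under stated assumptions, not the authors' code: the names `keep_detection` and `assign_label` are hypothetical, boxes are assumed to be given as pixel coordinates within a 300×300 tile, and a center lying exactly 30 pixels from the edge is assumed to be kept.

```python
from typing import Sequence, Tuple

Detection = Tuple[float, float, float, float, float]  # xmin, ymin, xmax, ymax, score


def keep_detection(det: Detection, tile_size: int = 300,
                   min_score: float = 0.85, margin: int = 30) -> bool:
    """True if the detection is confident enough and not too close to the tile edge."""
    xmin, ymin, xmax, ymax, score = det
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2  # box center
    # ASSUMPTION: a center exactly `margin` pixels from the edge is accepted.
    inside = (margin <= cx <= tile_size - margin) and (margin <= cy <= tile_size - margin)
    return score >= min_score and inside


def assign_label(probs: Sequence[float], class_names: Sequence[str],
                 min_prob: float = 0.4) -> str:
    """Return the most probable class, or 'unclassified' below the threshold."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best] if probs[best] >= min_prob else "unclassified"
```

Excluding cells near the tile edge avoids passing truncated cell images to the classifier, and the probability threshold routes ambiguous cells to expert review instead of forcing a label.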

Workflow of the bone marrow cell detection and morphological classification system from rat bone marrow smear specimens.*: red blood cell, bone marrow cell-1, bone marrow cell-2, bone marrow cell-3 and platelet.
The development of a neural network model for classifying bone marrow morphology from bone marrow smear slide images scanned with a 40× objective lens has been demonstrated in this study. One of the major challenges in constructing a classification system for rat bone marrow cells is that no data are currently publicly available for AI training, particularly in nonclinical research. By using cross-domain learning and fine-tuning methods, we were able to reduce the reliance on large-scale rat-specific datasets and efficiently develop a practical classification model. This approach may provide a valuable framework for other domains where annotated datasets are limited. A key aspect of the proposed system is its ability to bridge the gap between low-magnification WSIs and the high-resolution imaging traditionally required for bone marrow morphology classification. Manual classification of bone marrow cells is traditionally performed under high magnification using a 100× objective lens, which is labor-intensive and relies heavily on expert skills. In contrast, this system initially converts low-magnification WSIs into pseudo-high-magnification images through the VDSR model. The resulting images are then classified with the EfficientNetV2B0 model, enabling accurate determination of bone marrow morphologies. This integrated approach allows for high-throughput analysis of WSIs while mitigating the limitations of low resolution, ultimately providing a robust system with high prediction accuracy. On the other hand, certain bone marrow cell morphologies, such as blasts (BLA), promyelocytes (PMO), basophils (BAS), and plasmacytes (PLM), are extremely rare. Because training data for these types were limited, human bone marrow cell data had to be included in some cases and extensive data augmentation was used, so the generalization performance may be lower for these specific cell types.
In the future, the model's performance will be improved by accumulating training data and updating the model parameters. Overall, this study highlights the potential for integrating WSI-based imaging with neural network models to establish a scalable, objective, and high-throughput system for bone marrow morphology analysis.
The authors are grateful to the following people for their support of this work: Takuya Matsui, Yusuke Mashimo, Motoki Ono, Takuya Abe, Chizuru Matsuura and our colleagues of Japan Tobacco Inc.
Funding
No funding was provided for the work.
Conflict of interest
The authors declare that there is no conflict of interest.
Data availability
The data in this study are included in the article/supplementary materials. Contact the corresponding author(s) directly to request the underlying data.
Author contributions
Conceptualization: Naohito Yamada.
Investigation: Naohito Yamada, Yusuke Suzuki.
Writing – original draft: Naohito Yamada.
Writing – review & editing: Naohito Yamada, Taishi Shimazaki, Kyotaka Muta, Tadakazu Takahashi, Toshiyuki Shoda.
Ethical approval and consent to participate
Not applicable.
Patient consent for publication
Not applicable.