The Use of Text Mining to Obtain a Historical Overview of Research on Therapeutic Drug Monitoring

Tetsuo Matsuzaki; Hiroyuki Mizoguchi; Kiyofumi Yamada

doi:10.1248/bpb.b24-00319

Abstract

Therapeutic drug monitoring (TDM) is a routine clinical practice used to individualize drug dosing to maintain drug efficacy and minimize the consequences of overexposure. TDM is applied to many drug classes, including immunosuppressants, antineoplastic agents, and antibiotics. Considerable effort has been made to establish routine TDM practices for each drug. However, because TDM has been developed within the context of specific drugs, there is insufficient understanding of historical trends within the field of TDM research as a whole. In this study, we employed text-mining approaches to explore trends in the TDM research field. We first performed a PubMed search to determine which drugs and drug classes have been extensively studied in the context of TDM. This investigation revealed that the most commonly studied drugs are tacrolimus, followed by cyclosporine and vancomycin. With regard to drug classes, most studies focused on immunosuppressants, antibiotics, and antineoplastic agents. We also subjected PubMed records of TDM-related studies to a series of text-mining pipelines. Our analyses revealed how TDM research has evolved over the years, thereby serving as a cornerstone for forecasting future research trends.

INTRODUCTION

Therapeutic drug monitoring (TDM) is a clinical practice used to individualize drug dosing by maintaining serum, plasma, or blood drug concentrations within a therapeutic range.^1–4) The past 30 years have seen tremendous advances in TDM research, resulting in the increased adoption of TDM for a broad range of drugs.^5–8) Owing to the nature of the research on drugs, TDM research has been conducted in the context of specific drugs; however, this poses a challenge to our comprehensive understanding of the trends in the TDM research field as a whole. Indeed, only a few investigations and reviews focusing specifically on this research field have been performed.

Furthermore, there has been a dramatic increase in the number of publications not only on TDM research, but in all fields of study. Because of the vast amount of published literature, researchers can no longer read all available articles, even within their own narrow disciplines. This challenge can be tackled using text-mining techniques.⁹⁾ Text mining is the process of extracting and processing text to derive meaningful insights from textual data. This makes it possible to perform a targeted navigation of the knowledge landscape, thereby helping guide researchers in their endeavors. Studies applying text-mining techniques have been increasingly reported.^10–12) Recently, web-based platforms that facilitate text-mining analysis have been developed. For example, Wei et al. developed PubTator, a web-based system that assists in literature curation.^13,14)

In the present study, we sought to derive an unbiased and comprehensive overview of the progression of the TDM research field and a perspective on future research. To this end, we used text-mining techniques. We first determined which drugs and drug classes have been extensively studied in the context of TDM by analyzing the numbers of relevant publications stored in PubMed (https://pubmed.ncbi.nlm.nih.gov). We also developed a Python module that enables the automated extraction of the titles and abstracts of selected articles by their PubMed IDs. Using this module, we obtained a dataset comprising the records of publications relevant to TDM and subjected it to text mining. These analyses revealed the evolution of the TDM research field, which will serve as a foundation for guiding future studies.

MATERIALS AND METHODS

Python Modules

The Python modules used in this study are available at https://github.com/Matsuzaki-T/TDM_text-mining.

Retrieval of Drug Names from Kyoto Encyclopedia of Genes and Genomes (KEGG) DRUG

The analysis performed in this study covered drugs deposited in the KEGG DRUG Database (https://www.genome.jp/kegg/drug/), a comprehensive information resource for drugs approved in Japan, the United States (U.S.), and Europe.¹⁵⁾ Their records (entry IDs, drug names, and drug efficacies) were automatically extracted using Python module 1.0.

Measurement of TDM-Related Publication Count per Drug

The drug list collected by module 1.0 was subjected to module 2.0, which performed a PubMed search using the terms “(“therapeutic drug monitoring”) AND (drug name [MeSH Terms]) AND (“1900/1/1” [Date—Publication]: “2022/12/31” [Date—Publication]).” This resulted in a list that included the PubMed IDs of TDM-related publications for each drug. Of the entries in KEGG DRUG, we excluded human serum, blood cells, compounds that are not considered to be drugs (e.g., water), and those that are not administered for treatment (e.g., alcohol) (Supplementary Table S1).
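As an illustration of the per-drug search described above, the following minimal sketch composes the search term and a request URL for the NCBI E-utilities ESearch endpoint. It is not module 2.0 itself: whether the authors' module uses E-utilities is an assumption, and the function names build_tdm_query and build_esearch_url are hypothetical. No network request is made.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (assumed transport; module 2.0 may differ).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_tdm_query(drug_name, start="1900/1/1", end="2022/12/31"):
    """Compose the per-drug PubMed search term described in the text."""
    return (
        f'("therapeutic drug monitoring") AND ({drug_name}[MeSH Terms]) '
        f'AND ("{start}"[Date - Publication] : "{end}"[Date - Publication])'
    )


def build_esearch_url(drug_name, retmax=10000):
    """Return an ESearch URL that would list PubMed IDs for one drug (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "term": build_tdm_query(drug_name),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


query = build_tdm_query("tacrolimus")
url = build_esearch_url("tacrolimus")
print(query)
```

Looping such a builder over the drug names collected from KEGG DRUG would give one PubMed ID list per drug, which is the shape of output the text attributes to module 2.0.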

To investigate which drug classes (determined by efficacies in KEGG DRUG) were well-studied in the context of TDM, the drug list obtained by module 2.0 was subjected to module 2.1.

Based on the number of TDM-related publications, we grouped the drugs into three groups: the major group (100 or more publications), moderate group (10–99 publications), and minor group (fewer than 10 publications). We investigated the publications in each group and used module 2.2 to visualize the results in the form of a Venn diagram.
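The grouping rule above can be sketched as follows. The thresholds come from the text; the helper names and the toy PubMed IDs are hypothetical, and merging publications per group as a set union is an assumption about how the Venn diagram input was prepared.

```python
def assign_group(n_publications):
    """Map a drug's TDM-related publication count to its group."""
    if n_publications >= 100:
        return "major"
    if n_publications >= 10:
        return "moderate"
    return "minor"


def pmids_by_group(pmids_per_drug):
    """Union the PubMed IDs of all drugs that fall into the same group."""
    groups = {"major": set(), "moderate": set(), "minor": set()}
    for pmids in pmids_per_drug.values():
        groups[assign_group(len(pmids))].update(pmids)
    return groups


# Toy data: made-up integer IDs, not real PubMed records.
toy = {
    "drug_a": set(range(1, 151)),    # 150 publications -> major
    "drug_b": set(range(140, 160)),  # 20 publications -> moderate
    "drug_c": {159, 500},            # 2 publications -> minor
}
groups = pmids_by_group(toy)
shared_major_moderate = groups["major"] & groups["moderate"]
print(len(shared_major_moderate))
```

Because one publication can be indexed under drugs from different groups, the per-group sets overlap, which is what a Venn diagram of the three sets displays.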

Topic Word Extraction by Word Cloud

To perform automated text mining, we first attempted to retrieve text data from PubMed. One possible tool for this task is easyPubMed (https://cran.r-project.org/web/packages/easyPubMed/index.html), which is a readily available R interface that enables the automated extraction of PubMed records.^10,16) However, it was recently shown that easyPubMed yields fewer results than are provided directly by the PubMed website (reported in https://github.com/dami82/easyPubMed/issues/4). Therefore, we developed a new interface that accurately parses PubMed documents and retrieves their content. Module 3.0 enables the retrieval of the titles and abstracts of selected articles using their PubMed IDs by accessing PubTator Central (PTC), which stores and annotates the abstracts of all PubMed articles.^13,14) The titles and abstracts of articles stored in PTC are easily retrieved by queries using the characters “|t|” and “|a|” on the search results, respectively (a detailed description is presented on the PTC website, https://www.ncbi.nlm.nih.gov/research/pubtator/api.html). A list of PubMed IDs was obtained via the PubMed website using the search term “(“therapeutic drug monitoring”) AND (“1900/1/1” [Date—Publication]: “2022/12/31” [Date—Publication])” and then subjected to module 3.0, resulting in a list of the titles and abstracts of 13868 TDM-related publications.
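The role of the “|t|” and “|a|” markers can be illustrated with a small parser for PubTator-format text. This is a sketch under stated assumptions, not module 3.0: the function name parse_pubtator is hypothetical, the sample records are placeholders with made-up IDs, and retrieving the text from PTC is left out.

```python
def parse_pubtator(text):
    """Collect titles and abstracts from PubTator-format text, keyed by PubMed ID."""
    records = {}
    for line in text.splitlines():
        for tag, field in (("|t|", "title"), ("|a|", "abstract")):
            pmid, sep, content = line.partition(tag)
            # A title/abstract line starts with a numeric ID followed by the tag.
            if sep and pmid.isdigit():
                records.setdefault(pmid, {})[field] = content
                break
    return records


# Placeholder input: IDs, text, and the annotation identifier are invented.
sample = (
    "11111111|t|Placeholder title about tacrolimus monitoring\n"
    "11111111|a|Placeholder abstract text.\n"
    "11111111\t24\t34\ttacrolimus\tChemical\tMESH:D000000\n"
    "\n"
    "22222222|t|Another placeholder title\n"
    "22222222|a|\n"
)
records = parse_pubtator(sample)
print(records["11111111"]["title"])
```

Tab-separated annotation lines carry neither marker and are skipped, so only the title and abstract of each record are kept.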

The records were divided into four groups based on their publication dates: period I (before 2003), period II (2003–2013), period III (2014–2019), and period IV (2020–2022). These periods were defined such that the numbers of publications were similar (3000–4000 publications per period).
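The period boundaries above translate into a simple lookup. The function name assign_period is hypothetical; the year ranges are those given in the text.

```python
def assign_period(year):
    """Return the study period (I-IV) for a publication year up to 2022."""
    if year <= 2002:
        return "I"    # before 2003
    if year <= 2013:
        return "II"   # 2003-2013
    if year <= 2019:
        return "III"  # 2014-2019
    if year <= 2022:
        return "IV"   # 2020-2022
    raise ValueError(f"year {year} is outside the analyzed range")


periods = [assign_period(y) for y in (1975, 2003, 2019, 2022)]
print(periods)
```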

We retrieved biomedical entities from each title using the spaCy model “en_core_sci_lg,” which has been trained on biomedical text and enables the recognition of biomedical entities in documents.¹⁷⁾ We removed stop words, defined as entities that are commonly used in TDM research (e.g., therapy, monitoring, drug, and effect). The complete list of stop words is available in Supplementary Table S2.

The obtained list of biomedical entities was then subjected to word-cloud analysis (module 4.0), yielding a visual representation of the topic words.
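The step from extracted entities to word-cloud input can be sketched as a frequency count with stop-word removal. This is not module 4.0: entity extraction with spaCy is assumed to have been done already, lowercasing is an assumption, the function name is hypothetical, and the stop words shown are only the four examples quoted in the text (the full list is in Supplementary Table S2).

```python
from collections import Counter

# Only the examples named in the text; the complete list is in Supplementary Table S2.
STOP_WORDS = {"therapy", "monitoring", "drug", "effect"}


def topic_word_frequencies(entities_per_title, stop_words=STOP_WORDS):
    """Count entities across titles after dropping stop words (case-insensitive)."""
    counts = Counter()
    for entities in entities_per_title:
        for entity in entities:
            term = entity.strip().lower()
            if term and term not in stop_words:
                counts[term] += 1
    return counts


# Toy input standing in for entities extracted from two titles.
freq = topic_word_frequencies(
    [["Tacrolimus", "monitoring"], ["tacrolimus", "Vancomycin", "drug"]]
)
print(freq.most_common(2))
```

A frequency mapping of this kind is the usual input for word-cloud rendering, for example via the generate_from_frequencies method of the wordcloud package.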

Latent Dirichlet Allocation

Latent Dirichlet allocation (LDA) requires a document-term matrix as its input data structure.¹⁸⁾ Therefore, we began by converting the abstracts of the publications obtained by module 3.0 into a document-term matrix. We retrieved biomedical entities from each abstract as in the word-cloud analysis. The extracted entities were then converted into a bag-of-words document-term matrix using CountVectorizer from sklearn (https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html). We also created a term frequency–inverse document frequency (TF–IDF)-weighted matrix using TfidfVectorizer from sklearn (https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html). The two matrices, with and without TF–IDF weighting, were subjected to LDA using the LatentDirichletAllocation class provided by sklearn (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html). We conducted LDA with different numbers of topics, ranging from 10 to 20. These analyses were conducted using module 5.0.
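The vectorization and LDA steps can be sketched with the scikit-learn classes named above. This is an illustrative toy, not module 5.0: the entity lists are made up, the topic counts (2 and 3) are far smaller than the 10 to 20 used in the study, and passing pre-extracted entities through a pass-through analyzer is an assumption about how the entities were fed to the vectorizers.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Made-up entity lists standing in for entities extracted from six abstracts.
docs = [
    ["tacrolimus", "transplantation", "trough"],
    ["tacrolimus", "cyclosporine", "transplantation"],
    ["vancomycin", "auc", "nephrotoxicity"],
    ["vancomycin", "auc", "dosing"],
    ["infliximab", "ibd", "antibody"],
    ["adalimumab", "ibd", "antibody"],
]


def passthrough(tokens):
    """Use the pre-extracted entities as features without further tokenization."""
    return tokens


# Document-term matrices without and with TF-IDF weighting.
count_matrix = CountVectorizer(analyzer=passthrough).fit_transform(docs)
tfidf_matrix = TfidfVectorizer(analyzer=passthrough).fit_transform(docs)


def fit_lda(matrix, n_topics):
    """Fit LDA and return the model with its document-topic distribution."""
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    doc_topics = lda.fit_transform(matrix)
    return lda, doc_topics


# Scan a small range of topic numbers and record the perplexity for each.
results = {}
for k in (2, 3):
    lda, doc_topics = fit_lda(count_matrix, k)
    results[k] = (lda.perplexity(count_matrix), doc_topics)
    print(k, results[k][0])
```

The same loop can be run on tfidf_matrix to compare the two weightings, which mirrors the comparison described in the Results.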

Data Extraction

The data analyzed in this study were extracted from the KEGG DRUG Database and PubMed searches on August 20, 2024, unless otherwise specified.

RESULTS

Numbers of TDM-Related Publications per Year

In this study, we analyzed PubMed articles with publication dates before 2023. We first investigated all publications relevant to TDM and examined changes in the number of publications over time. A PubMed search using the term “(“therapeutic drug monitoring”) AND (“1900/1/1” [Date—Publication]: “2022/12/31” [Date—Publication])” yielded 13868 publications. The earliest publications, which reviewed a toxicology survey program, appeared in 1975.¹⁹⁾ Figure 1 shows the number of TDM-related publications published before 2023. The number of publications gradually increased, reaching 1119 in 2022.

Fig. 1. Numbers of TDM-Related Publications between 1970 and 2022

Investigation of TDM-Related Publication Count per Drug

Subsequently, we investigated which drugs and drug classes have been extensively studied in TDM-related research. Toward this end, we retrieved drug information from the KEGG DRUG Database.¹⁵⁾ The KEGG DRUG records (i.e., drug ID, drug names, and drug classes) were extracted using the Python-based module 1.0 (Fig. 2A). A PubMed search was performed for each selected drug name using the following search term: “(drug name [MeSH Terms]) AND (“therapeutic drug monitoring”) AND (“1900/1/1” [Date—Publication]: “2022/12/31” [Date—Publication]).” This was conducted in an automated manner using Python-based module 2.0 (Fig. 2A).

Fig. 2. Comprehensive Analysis of TDM-Related Publications per Drug

(A) Schematic of extraction of TDM-related publications per drug. Each drug name in the name column was subjected to a PubMed search, and all resulting PubMed IDs (PMIDs) were then merged. (B) Top 10 most studied drugs in TDM-related research. (C) Venn diagrams of TDM-related publications by drug group. (D) Top 10 most studied drug classes in TDM-related research.

A PubMed search showed that “globulin, immune” (D06458) was the top-ranked drug in terms of the number of publications (Supplementary Table S1). However, a PubMed search with the term “(globulin, immune [MeSH Terms]) AND (“therapeutic drug monitoring”) AND (“1900/1/1” [Date—Publication]: “2022/12/31” [Date—Publication])” yielded articles relevant to immunoassays and antibody therapies encompassing a broad range of diseases (Supplementary Table S3). Thus, we excluded “globulin, immune” from this ranking and regarded tacrolimus (650 publications) as the most frequently studied drug in TDM-related research, followed by cyclosporine (599 publications) and vancomycin (514 publications) (Fig. 2B, Supplementary Table S1).

We classified the drugs into three groups based on the number of publications: the major group (100 or more publications), moderate group (10–99 publications), and minor group (fewer than 10 publications). Based on the drug ID, the numbers of drugs categorized into the major, moderate, and minor groups were 62, 393, and 12026, respectively (Supplementary Table S1). It should be noted that there were overlaps among the drug records in KEGG DRUG; for example, D00752, D05094, D05095, and D05096 all indicate mycophenolic acid and its prodrugs. However, because applying manual data cleaning to such a large dataset was not practical, we instead counted the number of publications on each drug based on the drug ID. There were 5409 publications on drugs in the major group, accounting for 39.0% of the total of 13868 publications. On the other hand, there were 4495 (32.4%) and 1533 (11.1%) publications on drugs in the moderate and minor groups, respectively (Fig. 2C). That the group with the largest number of publications was the major group, which constitutes less than 1% of the drugs in the KEGG DRUG Database, indicates that only a small number of extensively studied drugs accounted for a large proportion of TDM-related publications.
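The percentages quoted above follow directly from the reported counts. The short check below uses only numbers stated in the text; the variable and function names are illustrative.

```python
total_publications = 13868
publications_per_group = {"major": 5409, "moderate": 4495, "minor": 1533}
drugs_per_group = {"major": 62, "moderate": 393, "minor": 12026}


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of all TDM-related publications covered by each drug group.
publication_share = {
    group: percent(n, total_publications)
    for group, n in publications_per_group.items()
}
# Share of KEGG DRUG entries (by drug ID) that belong to the major group.
major_drug_share = percent(drugs_per_group["major"], sum(drugs_per_group.values()))
print(publication_share, major_drug_share)
```

The group shares do not sum to 100% because a publication can be counted in more than one group and some publications match no drug name.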

Subsequently, when we focused on the drug classes (corresponding to efficacies in KEGG DRUG), we determined that the most studied class was immunosuppressant, followed by antibacterial (hereinafter referred to as antibiotics) and antineoplastic (hereinafter referred to as antineoplastic agents) (Fig. 2D). This result was derived based mainly on drugs in the major group, because these drugs constituted a major fraction of the subjects of the publications on each drug class: in particular, for immunosuppressants, 1757 out of 1808 publications were derived based on drugs in the major group. By contrast, for antivirals (hereinafter referred to as antiviral agents), the proportion of publications in the major group was relatively low (153 out of 570). This indicates that drugs with relatively small numbers of TDM-related studies constituted a significant portion of TDM research on antiviral agents, resulting in the number of publications for this drug class becoming one of the largest among the studied drug classes.

Visualization of Historical Changes in Topic Words in TDM Research

Afterward, we analyzed how the themes of TDM research have changed over time. To this end, we first extracted text data (i.e., titles and abstracts) from PubMed and then subjected them to text-mining analysis. As mentioned earlier, this task may be performed using easyPubMed; however, it has been reported that fewer PubMed records are obtained via easyPubMed than from the PubMed website (reported in https://github.com/dami82/easyPubMed/issues/4).^10,16) To overcome this problem, we developed a Python-based program to accurately retrieve the titles and abstracts of selected publications (Fig. 3A). Using this program, we retrieved the titles and abstracts of 13868 publications. We classified the publications into four groups based on their publication dates: period I (before 2003), period II (2003–2013), period III (2014–2019), and period IV (2020–2022). These periods were defined such that the publications were evenly distributed in number (3000–4000 publications per period).

Fig. 3. Historical Changes in Topic Words in TDM Research Field

(A) Overview of the pipeline for retrieving the titles and abstracts of specific publications by PubMed ID. (B) Schematic of word-cloud analysis. (C) Word-cloud representation of topic words per period. (D) Word clouds generated as in (C) after entity extraction with the modified list of stop words (Supplementary Table S4). (E) Numbers of TDM-related publications for specific drugs between 1970 and 2022. I, II, III, and IV indicate periods of publication. (F) Publication counts for the drugs in (E), shown in a single histogram.

Subsequently, we retrieved topic words in each period and visualized them using word clouds, a visualization module used to depict the frequencies of words in documents²⁰⁾ (Fig. 3B). We extracted biomedical entities from the title of each publication by leveraging the spaCy model “en_core_sci_lg.”¹⁷⁾ Figure 3C shows the word-cloud representations of the topic words in each period. Terms relevant to analytical methods, such as “chromatography,” “mass,” and “spectrometry,” were major topic words across all periods. The prevalence of these terms indicated studies that applied established LC or mass spectrometry (MS) methods or developed new ones, in both cases to monitor drug concentrations.^21,22) Because the predominance of method-related terms makes it difficult to capture the topics distinct to each period, we removed these terms as stop words (Supplementary Table S4). The reanalysis identified topic words that were distinct to each period, as follows (Fig. 3D). In period I, cyclosporine, theophylline, phenytoin, and carbamazepine were the most common topic words. In period II, theophylline and anticonvulsants (phenytoin and carbamazepine) were not retrieved as topic words, indicating that the number of TDM studies on theophylline and anticonvulsants decreased in this period. Among immunosuppressants, tacrolimus and mycophenolic acid were more common than cyclosporine. The frequency of tacrolimus further increased in period III, when it became one of the most predominant topic words. Conversely, the frequency of mycophenolic acid decreased in period III, indicating that TDM of mycophenolic acid had already been extensively studied in period II. Vancomycin also appeared among the most common topic words in period III. Other than drugs, terms relevant to inflammatory bowel disease (IBD) were featured in period III, indicating that TDM studies on drugs for IBD were extensively conducted in this period. In period IV, vancomycin and words relevant to IBD remained the most common topic words.

The drugs featured in the word-cloud analysis for each period were largely consistent with the publication counts (Figs. 3E, 3F). In period I, TDM-related publications on theophylline, phenytoin, and cyclosporine were more numerous than those on tacrolimus, mycophenolic acid, and vancomycin. In period II, TDM-related publications on tacrolimus and mycophenolic acid increased and surpassed in number those on theophylline, phenytoin, and cyclosporine. In period III, the number of TDM-related publications on mycophenolic acid gradually decreased, whereas that for vancomycin gradually increased, resulting in the predominance of tacrolimus and vancomycin. This predominance of tacrolimus and vancomycin persisted in period IV. Likewise, TDM-related publications on IBD became more frequent after period III (Supplementary Fig. S1). These results support the findings of the word-cloud analysis.

Topic Model Analysis by LDA

Finally, we attempted to categorize publications based on their abstracts. To this end, we applied LDA, a well-established technique for topic modeling.¹⁸⁾ With LDA, each document can be tagged with topics, making it possible to classify documents based on their content. Figure 4A shows a schematic diagram of the analysis. We started by converting each abstract into a matrix of token counts. We extracted the biomedical entities from the abstract of each publication, as in Fig. 3. We also adopted the TF–IDF approach, which determines the weights of words in a document based on the frequency of each word within the document (term frequency) compared with its frequency in other documents (inverse document frequency). The two matrices, with and without TF–IDF weighting, were then subjected to LDA. First, we determined the optimal number of topics for LDA, between 10 and 20, by measuring perplexity. Perplexity is an indicator of the predictive power and generalizability of a topic model, with a lower perplexity indicating a better generalization ability.¹⁸⁾ For the matrix with TF–IDF weighting, the perplexity steadily increased when the number of topics increased from 10 to 20 (Supplementary Fig. S2, left). Conversely, for the matrix without TF–IDF weighting, the perplexity decreased as the number of topics increased (Supplementary Fig. S2, right). The perplexity scores without TF–IDF weighting were also significantly lower than those with TF–IDF weighting. Based on these results, we performed the LDA analysis with the number of topics K = 20 without TF–IDF weighting.
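For reference, the two quantities used above can be written out in their standard textbook forms. The notation is introduced here for illustration (M documents, document d containing N_d tokens w_d, and df(t) the number of documents that contain term t); library implementations, including scikit-learn, typically use smoothed variants of the inverse document frequency and approximate the likelihood term with a variational bound.

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t),
\qquad
\mathrm{idf}(t) = \log \frac{M}{\mathrm{df}(t)}

\mathrm{perplexity}(D) = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
```

A lower perplexity corresponds to a higher average per-token likelihood, which is why it is read as better generalization.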

Fig. 4. Topic Modeling of TDM Research Field Using LDA

(A) Schematic of LDA. (B) Proposed titles and top 10 terms for each topic. (C) Changes in percentage of publication counts per topic.

We listed the group names alphabetically, with the title of each group derived based on the topic words retrieved by LDA and word-cloud analysis (Fig. 4B, Supplementary Fig. S3). Although we were able to assign a distinct title to most groups, we could not do the same for group B because of the presence of multiple topic words with ambiguous connections (i.e., words relevant to tuberculosis and carbamazepine).

Publications in groups A (MS) and R (LC) were both relevant to analytical methods. These two groups, when combined, included large numbers of publications across all periods. This result is consistent with the word-cloud analysis that featured method-related terms (Fig. 3C). Groups A and R showed contrasting trends in publication counts: publications in group A showed steady increases in number, whereas those in group R showed steady decreases in number across periods. This indicates that the focus of research has transitioned from LC to MS.

Next, we focused on topics other than groups A and R, as shown in Fig. 3D. In period I, groups I (immunoassay) and P (TDM using saliva samples) had large numbers of publications: 429 (13.5%) and 272 (8.6%), respectively (Fig. 4C, Supplementary Table S5). It was inferred that the large number of publications in group P was mainly due to TDM-related studies on theophylline, because the word cloud for group P in period I featured this drug (Supplementary Fig. S3P, period I).
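Per-period counts and percentages of this kind can be derived from the document-topic distribution, as sketched below. The text does not state how publications were assigned to groups, so assigning each publication to its highest-weight topic is an assumption; the function names and the toy topic weights are hypothetical.

```python
from collections import Counter, defaultdict


def dominant_topic(topic_weights):
    """Index of the topic with the largest weight for one document."""
    return max(range(len(topic_weights)), key=topic_weights.__getitem__)


def topic_share_by_period(records):
    """records: iterable of (period, topic_weights); returns percentages per period."""
    counts = defaultdict(Counter)
    for period, weights in records:
        counts[period][dominant_topic(weights)] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {
            topic: round(100 * n / total, 1) for topic, n in counter.items()
        }
    return shares


# Toy document-topic weights (made up), four documents in period I and one in period II.
toy_records = [
    ("I", [0.7, 0.2, 0.1]),
    ("I", [0.6, 0.3, 0.1]),
    ("I", [0.1, 0.1, 0.8]),
    ("I", [0.2, 0.5, 0.3]),
    ("II", [0.1, 0.8, 0.1]),
]
shares = topic_share_by_period(toy_records)
print(shares)
```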

In period II, group T (transplantation) became the largest group in terms of the number of publications. This was consistent with the result of the word-cloud analysis, which featured immunosuppressants (Fig. 3D, period II). TDM studies regarding the human immunodeficiency virus (HIV; group J) increased greatly in number (approximately 200) compared with period I. The word-cloud analysis featured a broad range of anti-HIV agents, which were developed from the late 1990s to the 2000s, indicating that newly developed anti-HIV agents were studied in the context of TDM (Supplementary Fig. S3J, period II).^22,23) Other than groups J and T, group H (antifungal agents) included large numbers of publications in period II and maintained its publication counts in periods III and IV. Considering the large contribution of voriconazole in this group (Supplementary Fig. S3H), the increase in the number of publications for group H in period II was presumably prompted by the U.S. approval of voriconazole in 2002.²⁴⁾

In period III, groups E (vancomycin) and L (IBD) demonstrated increases of more than 200 in the numbers of publications compared to their corresponding numbers in period II. This trend is consistent with the results of the word-cloud analysis for period III (Fig. 3D). Although not to the same extent as for groups E and L, the number of publications in group C (β-lactam antibiotics) increased in period III. Increases in the publication counts of groups E (vancomycin) and C (β-lactam antibiotics) indicate that efforts have been focused on developing TDM for antibiotics. Period III also witnessed a decrease in publications for group J (HIV). Publication counts from PubMed showed a peak in period II and a decrease thereafter (Supplementary Fig. S4A). Thus, it can be inferred that most research efforts on TDM for anti-HIV drugs had been concentrated mainly in period II.

In period IV, groups E (vancomycin) and L (IBD) remained two of the largest in terms of the number of publications, which was consistent with the word-cloud analysis (Fig. 3D, period IV).

On the other hand, the small numbers of publications in group S (cancer) across all periods were not consistent with the result shown in Fig. 2D, in which antineoplastic agents ranked third in the number of publications. To investigate this, we performed a PubMed search and found that TDM studies in the field of cancer treatment research have become more frequent since period III (Supplementary Fig. S4B). A closer look at the word clouds for periods III and IV showed that terms related to cancer (e.g., “cancer,” “asparaginase,” and “imatinib”) were featured in groups F (polymorphism) and H (antifungal agents) (Supplementary Figs. S3F, S3H). This indicates that in periods III and IV, a substantial fraction of TDM studies in the field of cancer treatment research were linked with polymorphism or the treatment of fungal infections, thereby dispersing TDM-related publications relevant to cancer treatment across several groups.

Altogether, we propose a historical overview of TDM research in Fig. 5.

Fig. 5. Evolution of TDM Research Field and Perspective on Future Topics

Major themes in each period and expected topics in the next few years are shown.

DISCUSSION

Despite the remarkable progress in TDM practice, there is insufficient information on how the field of TDM research has developed over the years. One recent study summarized an overview of TDM practices.²⁵⁾ However, that study was quite biased and not comprehensive because the drugs included therein were limited to commonly used ones. To obtain an unbiased and comprehensive overview of TDM research trends, we applied a series of text-mining approaches.^10–12) To our knowledge, this is the first study to employ text-mining techniques to provide a historical overview of TDM research trends.

The number of publications related to TDM increased steadily (Fig. 1). This indicates an upward trend in TDM research activity. By applying an automated system to perform PubMed searches, we found that according to the number of publications, tacrolimus was the most studied drug in TDM-related research (Fig. 2B). The top-ranked drugs were all drugs subject to TDM, validating the PubMed search term. In terms of drug classes, immunosuppressants were the most studied class, followed by antibiotics and antineoplastic agents (Fig. 2D). The result shown in Fig. 2D consists mainly of the drug classes of the top-ranked drugs shown in Fig. 2B, indicating that the TDM research field is constituted by only a small portion of all drugs. Indeed, drugs with publication counts ≥100 (the major group) represented less than 1% of the drugs in KEGG DRUG but accounted for 39.0% of TDM-related publications (Fig. 2C and Supplementary Table S1). Together, we conclude that only a small number of drugs have been the focus of most TDM studies.

Subsequently, we adopted well-established text-mining techniques to determine the evolution of the TDM research field over time. The result of word-cloud analysis and LDA identified distinct topics in each period, enabling us to understand when the top-ranked drugs shown in Fig. 2B were extensively studied.

The main conclusion of this study is a historical overview of TDM research, which is summarized in Fig. 5. In period I (before 2003), TDM research focused on theophylline, anticonvulsants, immunosuppressants (mainly cyclosporine), and TDM methods. The first two topics showed downward trends, whereas the last two topics remained important across all periods. In period II (2003–2013), TDM studies on immunosuppressants focused more on mycophenolic acid and tacrolimus than on cyclosporine. Period II also witnessed increases in the number of publications on drugs classified as antifungal agents (group H) and for the treatment of HIV (group J), although the absolute numbers of publications were not as large as that for immunosuppressants. In period III (2014–2019), TDM studies on HIV treatment became less frequent, whereas those on antifungal agents remained constant in number. With regard to immunosuppressants, publications on cyclosporine and mycophenolic acid decreased in number, and studies focused mainly on tacrolimus. This reflects the accumulating evidence of tacrolimus as a first-line calcineurin inhibitor.^26,27) On the other hand, publications on IBD, antibiotics, and antineoplastic agents increased in number and became major topics of TDM-related research, a trend that continued into period IV (2020–2022). Word-cloud representations of group L (IBD) featured biopharmaceuticals (e.g., infliximab and adalimumab, Supplementary Fig. S3L). This reflects the TDM practice in biologic therapies since the 2010s, although the practice is not common in several countries including Japan.^4,6,28,29) Terms relevant to rheumatoid arthritis (RA) were not featured as prominently as those relevant to IBD, despite the use of infliximab and adalimumab in the treatment of RA (Fig. 3D, Supplementary Fig. S3L). This indicates that TDM studies on infliximab and adalimumab have been conducted mainly in the context of IBD treatment. TDM studies on antibiotics focused mainly on vancomycin, as the publication counts of group E (vancomycin) were larger than those of group C (β-lactam antibiotics). For group C, a plausible reason for the increase in the number of publications is the need for optimization of antibiotic use to combat the global concern of antimicrobial resistance, which was raised by The Review on Antimicrobial Resistance in 2014.³⁰⁾ The increase in the number of TDM studies on vancomycin presumably resulted from the first consensus guidelines for vancomycin TDM published in 2009, which led to research studies evaluating these guidelines.^31–34) For antineoplastic agents, asparaginase (group F), imatinib (group H), methotrexate (group S), and 5-fluorouracil (group S) were featured in the word cloud (Supplementary Figs. S3F, S3H, S3S, periods III and IV). The first three drugs are used for the treatment of leukemia, indicating that among recent TDM-related studies in the field of cancer treatment research, many have focused on the treatment of leukemia.

In addition to the longitudinal trends in TDM research, our analysis also shed light on future research trends. We infer that TDM studies on IBD, antibiotics (mainly vancomycin), and antineoplastic agents, which have been major topics since period III, will be placed at the center of TDM research in the next few years, because publication counts in PubMed have shown upward trends (Fig. 3E and Supplementary Figs. S1 and S4B). Conversely, publication counts for tacrolimus and cyclosporine, which are the most studied drugs at present (Fig. 2B), have shown a moderate increase and remained constant, respectively (Fig. 3E). Based on this trend, taken together with the large numbers of TDM-related publications regarding vancomycin at present, the currently top-ranked drugs (tacrolimus and cyclosporine) will eventually be replaced by vancomycin. Meanwhile, TDM studies on antifungal agents have had relatively large numbers of publications since period II. The word frequencies of newly approved drugs, posaconazole and isavuconazole, have increased steadily since period II and became the most frequent words in period IV (Supplementary Fig. S3H). This indicates that the TDM of these new drugs has garnered interest. The emergence of posaconazole and isavuconazole implies upward trends in the TDM of antifungal agents, and thus, we speculate that this class of drugs could potentially become a major topic of research.

The limitations of this study are as follows. First, we retrieved information on prescription drugs from the KEGG DRUG Database. However, as with other drug databases, KEGG DRUG is not comprehensive. Although we annotated drug classes based on the records in KEGG DRUG, these records were sometimes inconsistent with those used in clinical settings. For instance, during the preparation of this manuscript, thalidomide (D00754) was recorded as an antibiotic and antineoplastic agent, although it is not used as an antibiotic in clinical settings.^35,36) Second, we analyzed only the publications recorded in PubMed. Evaluating publications recorded in other repositories, such as Google Scholar (https://scholar.google.com) and Web of Science (https://clarivate.com/webofsciencegroup/solutions/web-of-science/), will provide a more accurate and comprehensive historical overview of TDM research. It should also be noted that gray literature and unpublished studies were not addressed in this study, which may have introduced some bias. Third, we queried TDM-related publications based on MeSH indexing, and therefore, this study excluded publications that were not assigned MeSH terms. Indeed, only 66.9% (9281/13868) of TDM-related publications were recovered by the drug name search, indicating that a large number of publications were not assigned MeSH terms for the drug names (Fig. 2C).

In summary, we used text-mining tools to provide an overview of how the TDM research field has evolved over time. This overview may serve as a foundation for future studies.

Acknowledgments

This study was supported by JSPS KAKENHI Grants JP24K17995 (T.M.), JP23H02669 (K.Y.), and JP22K19749 (H.M.), Chukyo Longevity Medical Research and Promotion Foundation (T.M.), and Morinomiyako Medical Research Foundation (T.M.).

Author Contributions

T.M. designed the project, developed the modules, analyzed and interpreted the data, prepared the figures, and wrote the manuscript. H.M. and K.Y. analyzed and interpreted the data and wrote the manuscript.

Conflict of Interest

The authors declare no conflict of interest.

Supplementary Materials

This article contains supplementary materials. The following files are available free of charge: Supplementary Tables S1–S5 and Supplementary Figs. S1–S4. The dataset and code for the modules used to derive the results in this study are available at https://github.com/Matsuzaki-T/TDM_text-mining.

REFERENCES

1) Rybak MJ, Le J, Lodise TP, Levine DP, Bradley JS, Liu C, Mueller BA, Pai MP, Wong-Beringer A, Rotschafer JC, Rodvold KA, Maples HD, Lomaestro B. Therapeutic monitoring of vancomycin for serious methicillin-resistant Staphylococcus aureus infections: a revised consensus guideline and review by the American Society of Health-system Pharmacists, the Infectious Diseases Society of America, the Pediatric Infectious Diseases Society, and the Society of Infectious Diseases Pharmacists. Clin. Infect. Dis., 71, 1361–1364 (2020).
2) Miyai T, Imai S, Yoshimura E, Kashiwagi H, Sato Y, Ueno H, Takekuma Y, Sugawara M. Machine learning-based model for estimating vancomycin maintenance dose to target the area under the concentration curve of 400–600 mg·h/L in Japanese patients. Biol. Pharm. Bull., 45, 1332–1339 (2022).
3) Matsuzaki T, Kato Y, Mizoguchi H, Yamada K. A machine learning model that emulates experts’ decision making in vancomycin initial dose planning. J. Pharmacol. Sci., 148, 358–363 (2022).
4) Feuerstein JD, Nguyen GC, Kupfer SS, Falck-Ytter Y, Singh S, Gerson L, Hirano I, Rubenstein JH, Smalley WE, Stollman N, Sultan S, Vege SS, Wani SB, Weinberg D, Yang YX. American Gastroenterological Association Institute guideline on therapeutic drug monitoring in inflammatory bowel disease. Gastroenterology, 153, 827–834 (2017).
5) Miura M. Therapeutic drug monitoring of imatinib, nilotinib, and dasatinib for patients with chronic myeloid leukemia. Biol. Pharm. Bull., 38, 645–654 (2015).
6) Yonezawa A. Therapeutic drug monitoring of antibody drugs. Biol. Pharm. Bull., 45, 843–846 (2022).
7) Muraki Y, Koizumi R, Kusama Y, Inose R, Ishikane M, Ohmagari N. Necessity for a system implementing therapeutic drug monitoring in outpatient settings based on the actual use of voriconazole using the National Database of Health Insurance Claims and Specific Health Checkups of Japan: a descriptive epidemiological study. Biol. Pharm. Bull., 46, 1490–1493 (2023).
8) Yamamoto Y, Usui N, Kagawa Y, Imai K. Time-course changes in lamotrigine concentration after addition of valproate and the safety and long-term tolerability of lamotrigine-valproate combination therapy. Biol. Pharm. Bull., 47, 43–48 (2024).
9) Extance A. How AI technology can tame the scientific literature. Nature, 561, 273–274 (2018).
10) Yim WW-Y, Kurikawa Y, Mizushima N. An exploratory text analysis of the autophagy research field. Autophagy, 18, 1648–1661 (2022).
11) Klang E, Soffer S, Alper L, Shimon O, Barash Y, Davidov Y, Likhter M, Cohen-Ezra O, Ben Yakov G, Ben-Ari Z. Research trends analysis of chronic hepatitis C versus nonalcoholic fatty liver disease: a literature review text-mining analysis of publications. Health Sci. Rep., 5, e805 (2022).
12) Kim Y-M. Discovering major opioid-related research themes over time: a text mining technique. AMIA Jt. Summits Transl. Sci. Proc., 2019, 751–760 (2019).
13) Wei CH, Kao HY, Lu Z. PubTator: a web-based text mining tool for assisting biocuration. Nucleic Acids Res., 41 (W1), W518–W522 (2013).
14) Wei CH, Allot A, Leaman R, Lu Z. PubTator Central: automated concept annotation for biomedical full text articles. Nucleic Acids Res., 47 (W1), W587–W593 (2019).
15) Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res., 44 (D1), D457–D462 (2016).
16) Mehran R, Kumar A, Bansal A, Shariff M, Gulati M, Kalra A. Gender and disparity in first authorship in cardiology randomized clinical trials. JAMA Netw. Open, 4, e211043 (2021).
17) Neumann M, King D, Beltagy I, Ammar W. ScispaCy: fast and robust models for biomedical natural language processing. Proceedings of the 18th BioNLP Workshop and Shared Task. (Demner-Fushman D, Cohen KB, Ananiadou S, Tsujii J eds.) Association for Computational Linguistics, Kerrville, Texas, pp. 319–327 (2019).
18) Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J. Mach. Learn. Res., 3, 993–1022 (2003).
19) Sohn D, Baden M. The first year of the toxicology program. Am. J. Clin. Pathol., 63 (6 SUPPL), 1012–1015 (1975).
20) Atenstaedt R. Word cloud analysis of the BJGP. Br. J. Gen. Pract., 62, 148 (2012).
21) Zanchetta M, Iacuzzi V, Posocco B, Bortolin G, Poetto AS, Orleni M, Canil G, Guardascione M, Foltran L, Fanotto V, Puglisi F, Gagno S, Toffoli G. A rapid, simple and sensitive LC-MS/MS method for lenvatinib quantification in human plasma for therapeutic drug monitoring. PLOS ONE, 16, e0259137 (2021).
22) Takahashi M, Yoshida M, Oki T, Okumura N, Suzuki T, Kaneda T. Conventional HPLC method used for simultaneous determination of the seven HIV protease inhibitors and nonnucleoside reverse transcription inhibitor efavirenz in human plasma. Biol. Pharm. Bull., 28, 1286–1290 (2005).
23) Kredo T, Van der Walt J-S, Siegfried N, Cohen K. Therapeutic drug monitoring of antiretrovirals for people with HIV. Cochrane Libr., CD007268 (2009).
24) Cecil JA, Wenzel RP. Voriconazole: a broad-spectrum triazole for the treatment of invasive fungal infections. Expert Rev. Hematol., 2, 237–254 (2009).
25) Fang Z, Zhang H, Guo J, Guo J. Overview of therapeutic drug monitoring and clinical practice. Talanta, 266, 124996 (2024).
26) Eckardt K-U, Kasiske BL, Zeier MG. Special issue: KDIGO clinical practice guideline for the care of kidney transplant recipients. Am. J. Transplant., 9, S1–S155 (2009).
27) Muduma G, Saunders R, Odeyemi I, Pollock RF. Systematic review and meta-analysis of tacrolimus versus ciclosporin as primary immunosuppression after liver transplant. PLOS ONE, 11, e0160421 (2016).
28) Afif W, Loftus EV Jr, Faubion WA, Kane SV, Bruining DH, Hanson KA, Sandborn WJ. Clinical utility of measuring infliximab and human anti-chimeric antibody concentrations in patients with inflammatory bowel disease. Am. J. Gastroenterol., 105, 1133–1139 (2010).
29) Vande Casteele N, Herfarth H, Katz J, Falck-Ytter Y, Singh S. American Gastroenterological Association Institute technical review on the role of therapeutic drug monitoring in the management of inflammatory bowel diseases. Gastroenterology, 153, 835–857.e6 (2017).
30) O’Neill J. Antimicrobial resistance: tackling a crisis for the health and wealth of nations. Rev. Antimicrob. Resist. (2014). https://amr-review.org/sites/default/files/AMR%20Review%20Paper%20-%20Tackling%20a%20crisis%20for%20the%20health%20and%20wealth%20of%20nations_1.pdf
31) Suzuki Y, Kawasaki K, Sato Y, Tokimatsu I, Itoh H, Hiramatsu K, Takeyama M, Kadota JI. Is peak concentration needed in therapeutic drug monitoring of vancomycin? A pharmacokinetic-pharmacodynamic analysis in patients with methicillin-resistant Staphylococcus aureus pneumonia. Chemotherapy, 58, 308–312 (2012).
32) Rybak M, Lomaestro B, Rotschafer JC, Moellering R Jr, Craig W, Billeter M, Dalovisio JR, Levine DP. Therapeutic monitoring of vancomycin in adult patients: a consensus review of the American Society of Health-System Pharmacists, the Infectious Diseases Society of America, and the Society of Infectious Diseases Pharmacists. Am. J. Health Syst. Pharm., 66, 82–98 (2009).
33) Neely MN, Kato L, Youn G, Kraler L, Bayard D, Van Guilder M, Schumitzky A, Yamada W, Jones B, Minejima E. Prospective trial on the use of trough concentration versus area under the curve to determine therapeutic vancomycin dosing. Antimicrob. Agents Chemother., 62, e02042-17 (2018).
34) Zasowski EJ, Murray KP, Trinh TD, Finch NA, Pogue JM, Mynatt RP, Rybak MJ. Identification of vancomycin exposure-toxicity thresholds in hospitalized patients receiving intravenous vancomycin. Antimicrob. Agents Chemother., 62, e01684-17 (2017).
35) Singhal S, Mehta J, Desikan R, Ayers D, Roberson P, Eddlemon P, Munshi N, Anaissie E, Wilson C, Dhodapkar M, Zeddis J, Barlogie B. Antitumor activity of thalidomide in refractory multiple myeloma. N. Engl. J. Med., 341, 1565–1571 (1999).
36) Kamikawa R, Ikawa K, Morikawa N, Asaoku H, Iwato K, Sasaki A. The pharmacokinetics of low-dose thalidomide in Japanese patients with refractory multiple myeloma. Biol. Pharm. Bull., 29, 2331–2334 (2006).
