2024 Volume 47 Issue 11 Pages 1883-1892
Therapeutic drug monitoring (TDM) is a routine clinical practice used to individualize drug dosing to maintain drug efficacy and minimize the consequences of overexposure. TDM is applied to many drug classes, including immunosuppressants, antineoplastic agents, and antibiotics. Considerable effort has been made to establish routine TDM practices for each drug. However, because TDM has been developed within the context of specific drugs, there is insufficient understanding of historical trends within the field of TDM research as a whole. In this study, we employed text-mining approaches to explore trends in the TDM research field. We first performed a PubMed search to determine which drugs and drug classes have been extensively studied in the context of TDM. This investigation revealed that the most commonly studied drugs are tacrolimus, followed by cyclosporine and vancomycin. With regard to drug classes, most studies focused on immunosuppressants, antibiotics, and antineoplastic agents. We also subjected PubMed records of TDM-related studies to a series of text-mining pipelines. Our analyses revealed how TDM research has evolved over the years, thereby serving as a cornerstone for forecasting future research trends.
Therapeutic drug monitoring (TDM) is a clinical practice used to individualize drug dosing by maintaining serum, plasma, or blood drug concentrations within a therapeutic range.1–4) The past 30 years have seen tremendous advances in TDM research, resulting in the increased adoption of TDM for a broad range of drugs.5–8) Owing to the nature of the research on drugs, TDM research has been conducted in the context of specific drugs; however, this poses a challenge to our comprehensive understanding of the trends in the TDM research field as a whole. Indeed, only a few investigations and reviews focusing specifically on this research field have been performed.
Furthermore, there has been a dramatic increase in the number of publications not only on TDM research, but in all fields of study. Because of the vast amount of published literature, researchers can no longer read all available articles, even within their own narrow disciplines. This challenge can be tackled using text-mining techniques.9) Text mining is the process of extracting and processing text to derive meaningful insights from textual data. This makes it possible to perform a targeted navigation of the knowledge landscape, thereby helping guide researchers in their endeavors. Studies applying text-mining techniques have been increasingly reported.10–12) Recently, web-based platforms that facilitate text-mining analysis have been developed. For example, Wei et al. developed PubTator, a web-based system that assists in literature curation.13,14)
In the present study, we sought to derive an unbiased and comprehensive overview of the progression of the TDM research field and the perspective on future research. Toward this end, we harnessed the power of text-mining techniques. We first determined which drugs and drug classes have been extensively studied in the context of TDM by analyzing the numbers of relevant publications stored in PubMed (https://pubmed.ncbi.nlm.nih.gov). We also developed a Python module that enables the automated extraction of the titles and abstracts of selected articles by their PubMed IDs. Using this module, we obtained a dataset comprising the records of publications relevant to TDM and subjected it to text mining. These analyses revealed the evolution of the TDM research field, which will serve as a foundation for guiding future studies.
The Python modules used in this study are available at https://github.com/Matsuzaki-T/TDM_text-mining.
Retrieval of Drug Names from Kyoto Encyclopedia of Genes and Genomes (KEGG) DRUG
The analysis performed in this study covered drugs deposited in the KEGG DRUG Database (https://www.genome.jp/kegg/drug/), a comprehensive information resource for drugs approved in Japan, the United States (U.S.), and Europe.15) Their records (entry IDs, drug names, and drug efficacies) were automatically extracted using Python module 1.0.
Measurement of TDM-Related Publication Count per Drug
The drug list collected by module 1.0 was subjected to module 2.0, which performed a PubMed search using the search term “(“therapeutic drug monitoring”) AND (drug name [MeSH Terms]) AND (“1900/1/1” [Date - Publication]: “2022/12/31” [Date - Publication]).” This resulted in a list that included the PubMed IDs of TDM-related publications for each drug. Of the entries in KEGG DRUG, we excluded human serum, blood cells, compounds that are not considered to be drugs (e.g., water), and those that are not administered for treatment (e.g., alcohol) (Supplementary Table S1).
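The per-drug search can be sketched as follows. This is a minimal illustration, not the code of module 2.0: the function names are hypothetical, and the use of the NCBI E-utilities `esearch` endpoint is an assumption, since the text states only that a PubMed search was performed.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here; the text does not say
# how module 2.0 contacts PubMed).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_tdm_query(drug_name, start="1900/1/1", end="2022/12/31"):
    """Return the per-drug search term described in the text."""
    return (
        f'("therapeutic drug monitoring") AND ({drug_name}[MeSH Terms]) '
        f'AND ("{start}"[Date - Publication] : "{end}"[Date - Publication])'
    )


def build_esearch_url(drug_name, retmax=10000):
    """Return an esearch URL whose JSON response would list matching PMIDs."""
    params = {
        "db": "pubmed",
        "term": build_tdm_query(drug_name),
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_tdm_query("tacrolimus"))
    print(build_esearch_url("tacrolimus"))
```

Building the query separately from the request makes it easy to loop over every KEGG DRUG name and to log the exact term used for each drug.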
To investigate which drug classes (determined by efficacies in KEGG DRUG) were well-studied in the context of TDM, the drug list obtained by module 2.0 was subjected to module 2.1.
Based on the number of TDM-related publications, we grouped the drugs into three groups: the major group (100 or more publications), moderate group (10–99 publications), and minor group (fewer than 10 publications). We investigated the publications in each group and used module 2.2 to visualize the results in the form of a Venn diagram.
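The grouping rule above can be sketched in a few lines. The thresholds come from the text, and the counts for tacrolimus, cyclosporine, and vancomycin are those reported in the Results; the other two drugs and all PMIDs are hypothetical placeholders, and this is not the code of module 2.2.

```python
def classify_drug(publication_count):
    """Map a TDM-related publication count to the group defined in the text."""
    if publication_count >= 100:
        return "major"
    if publication_count >= 10:
        return "moderate"
    return "minor"


# Counts for the first three drugs are taken from the Results;
# the last two entries are hypothetical.
counts = {
    "tacrolimus": 650,
    "cyclosporine": 599,
    "vancomycin": 514,
    "hypothetical_drug_x": 42,
    "hypothetical_drug_y": 3,
}

groups = {}
for drug, n in counts.items():
    groups.setdefault(classify_drug(n), []).append(drug)

# A Venn diagram of publications per group reduces to set operations on
# PMIDs. The PMIDs below are invented for illustration only.
pmids_by_group = {
    "major": {101, 102, 103},
    "moderate": {103, 104},
    "minor": {105},
}
shared_major_moderate = pmids_by_group["major"] & pmids_by_group["moderate"]

if __name__ == "__main__":
    print(groups)
    print(shared_major_moderate)
```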
Topic Word Extraction by Word Cloud
To perform automated text mining, we first attempted to retrieve text data from PubMed. One possible tool for this task is easyPubMed (https://cran.r-project.org/web/packages/easyPubMed/index.html), which is a readily available R interface that enables the automated extraction of PubMed records.10,16) However, it was recently shown that easyPubMed yields fewer results than are provided directly by the PubMed website (reported in https://github.com/dami82/easyPubMed/issues/4). Therefore, we developed a new interface that accurately parses PubMed documents and retrieves their content. Module 3.0 enables the retrieval of the titles and abstracts of selected articles using their PubMed IDs by accessing PubTator Central (PTC), which stores and annotates the abstracts of all PubMed articles.13,14) In the PTC search results, the title and abstract of each article can be located by the markers “|t|” and “|a|,” respectively (a detailed description is presented on the PTC website, https://www.ncbi.nlm.nih.gov/research/pubtator/api.html). A list of PubMed IDs was obtained via the PubMed website using the search term “(“therapeutic drug monitoring”) AND (“1900/1/1” [Date - Publication]: “2022/12/31” [Date - Publication])” and then subjected to module 3.0, resulting in a list of the titles and abstracts of 13868 TDM-related publications.
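The “|t|” and “|a|” markers lend themselves to a simple line-based parser. The sketch below assumes the PubTator plain-text layout in which each title line reads `PMID|t|title` and each abstract line reads `PMID|a|abstract`; the function name and the sample record are hypothetical, and the actual parsing in module 3.0 may differ.

```python
def parse_pubtator(text):
    """Collect titles and abstracts from PubTator-format text, keyed by PMID."""
    records = {}
    for line in text.splitlines():
        # Split on the first two pipes only, so pipes inside the
        # title or abstract text are preserved.
        parts = line.split("|", 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # annotation lines and blank separators are skipped
        pmid, field, content = parts
        if field == "t":
            records.setdefault(pmid, {"title": "", "abstract": ""})["title"] = content
        elif field == "a":
            records.setdefault(pmid, {"title": "", "abstract": ""})["abstract"] = content
    return records


# Hypothetical record, invented for illustration.
sample = (
    "12345|t|Therapeutic drug monitoring of tacrolimus\n"
    "12345|a|We measured trough concentrations in transplant recipients.\n"
    "12345\t31\t41\ttacrolimus\tChemical\tMESH:D016559\n"
)

if __name__ == "__main__":
    print(parse_pubtator(sample))
```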
The records were divided into four groups based on their publication dates: period I (before 2003), period II (2003–2013), period III (2014–2019), and period IV (2020–2022). These periods were defined such that the numbers of publications were similar (3000–4000 publications per period).
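The period boundaries can be expressed as a small lookup. This is a sketch of the rule stated above; the function name is hypothetical.

```python
def assign_period(year):
    """Return the period label (I to IV) for a publication year."""
    if year < 2003:
        return "I"
    if year <= 2013:
        return "II"
    if year <= 2019:
        return "III"
    if year <= 2022:
        return "IV"
    raise ValueError(f"year {year} is outside the analyzed range (up to 2022)")


if __name__ == "__main__":
    for y in (1975, 2003, 2014, 2020):
        print(y, assign_period(y))
```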
We retrieved biomedical entities from each title using the spaCy model “en_core_sci_lg,” which has been trained on biomedical text and enables the recognition of biomedical entities in documents.17) We removed stop words, defined as entities that are commonly used in TDM research (e.g., therapy, monitoring, drug, and effect). The complete list of stop words is available in Supplementary Table S2.
The obtained list of biomedical entities was then subjected to word-cloud analysis (module 4.0), yielding a visual representation of the topic words.
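The filtering and counting that precede the word cloud can be sketched as below. The entity lists are hypothetical stand-ins for the output of the spaCy model, the stop-word set contains only the four examples given in the text (the full list is in Supplementary Table S2), and the function name is an assumption rather than part of module 4.0.

```python
from collections import Counter

# Only the four examples named in the text; the complete list is in
# Supplementary Table S2.
STOP_WORDS = {"therapy", "monitoring", "drug", "effect"}


def topic_word_frequencies(entities_per_title, stop_words=STOP_WORDS):
    """Count entities across titles after dropping stop words."""
    counts = Counter()
    for entities in entities_per_title:
        for entity in entities:
            token = entity.lower().strip()
            if token and token not in stop_words:
                counts[token] += 1
    return counts


# Hypothetical entities, as they might be returned for three titles.
titles = [
    ["Tacrolimus", "drug", "monitoring"],
    ["vancomycin", "therapy"],
    ["tacrolimus", "transplantation"],
]
frequencies = topic_word_frequencies(titles)

if __name__ == "__main__":
    print(frequencies.most_common(3))
```

A frequency table of this kind is the usual input to a word-cloud renderer; for example, the `wordcloud` package accepts one through `WordCloud.generate_from_frequencies`, although the text does not state which renderer module 4.0 uses.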
Latent Dirichlet Allocation
Latent Dirichlet allocation (LDA) requires a document-term matrix as its input data structure.18) Therefore, we began by converting the abstracts of the publications obtained by module 3.0 into a document-term matrix. We retrieved biomedical entities from each abstract as in the word-cloud analysis. The obtained document-term matrix was then converted into a bag-of-words representation using CountVectorizer from sklearn (https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html). We also created a term frequency–inverse document frequency (TF–IDF)-weighted matrix using the relevant module from sklearn (https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html). The two bag-of-words matrices with and without TF–IDF weighting were subjected to LDA using the LDA module provided by sklearn (https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html). We conducted LDA with different numbers of topics, ranging from 10 to 20. These analyses were conducted using module 5.0.
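To make the two input matrices concrete, the sketch below builds a document-term count matrix and a TF–IDF-weighted version from three toy documents using only the standard library. It is an illustration of the data structures, not of module 5.0: the documents are hypothetical, and the weighting uses the textbook form idf = log(N / df), whereas sklearn's TfidfVectorizer applies a smoothed idf and L2 normalization by default, so its values would differ.

```python
import math
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, count matrix) for a list of tokenized documents."""
    vocab = sorted({term for doc in docs for term in doc})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for term, n in Counter(doc).items():
            row[index[term]] = n
        matrix.append(row)
    return vocab, matrix


def tfidf(matrix):
    """Weight raw counts by the textbook idf, log(N / document frequency)."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[row[j] * idf[j] for j in range(n_terms)] for row in matrix]


# Hypothetical entity lists standing in for three abstracts.
docs = [
    ["tacrolimus", "transplantation", "tacrolimus"],
    ["vancomycin", "trough"],
    ["tacrolimus", "trough"],
]
vocab, counts = document_term_matrix(docs)
weighted = tfidf(counts)

if __name__ == "__main__":
    print(vocab)
    print(counts)
    print(weighted)
```

In the weighted matrix, a term that occurs in few documents (here "transplantation") receives a larger weight per occurrence than one spread across many documents (here "tacrolimus").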
Data Extraction
The data analyzed in this study were extracted from the KEGG DRUG Database and PubMed searches on August 20, 2024, unless otherwise specified.
In this study, we analyzed PubMed articles with publication dates before 2023. We first investigated all publications relevant to TDM and examined changes in the number of publications over time. A PubMed search using the term “(“therapeutic drug monitoring”) AND (“1900/1/1” [Date - Publication]: “2022/12/31” [Date - Publication])” yielded 13868 publications. The earliest publications, which reviewed a toxicology survey program, appeared in 1975.19) Figure 1 shows the number of TDM-related publications published before 2023. The number of publications gradually increased, reaching 1119 in 2022.
Subsequently, we investigated which drugs and drug classes have been extensively studied in TDM-related research. Toward this end, we retrieved drug information from the KEGG DRUG Database.15) The KEGG DRUG records (i.e., drug ID, drug names, and drug classes) were extracted using the Python-based module 1.0 (Fig. 2A). A PubMed search was performed for each selected drug name using the following search term: “(drug name [MeSH Terms]) AND (“therapeutic drug monitoring”) AND (“1900/1/1” [Date - Publication]: “2022/12/31” [Date - Publication]).” This was conducted in an automated manner using Python-based module 2.0 (Fig. 2A).
Fig. 2. (A) Schematic of the extraction of TDM-related publications per drug. Each drug name in the name column was subjected to a PubMed search, and all resulting PubMed IDs (PMIDs) were then merged. (B) Top 10 most studied drugs in TDM-related research. (C) Venn diagrams of TDM-related publications by drug group. (D) Top 10 most studied drug classes in TDM-related research.
A PubMed search showed that “globulin, immune” (D06458) was the top-ranked drug in terms of the number of publications (Supplementary Table S1). However, a PubMed search with the term “(globulin, immune [MeSH Terms]) AND (“therapeutic drug monitoring”) AND (“1900/1/1” [Date - Publication]: “2022/12/31” [Date - Publication])” yielded articles relevant to immunoassays and antibody therapies encompassing a broad range of diseases (Supplementary Table S3). Thus, we excluded “globulin, immune” from this ranking and regarded tacrolimus (650 publications) as the most frequently studied drug in TDM-related research, followed by cyclosporine (599 publications) and vancomycin (514 publications) (Fig. 2B, Supplementary Table S1).
We classified the drugs into three groups based on the number of publications: the major group (100 or more publications), moderate group (10–99 publications), and minor group (fewer than 10 publications). Based on the drug ID, the numbers of drugs categorized into the major, moderate, and minor groups were 62, 393, and 12026, respectively (Supplementary Table S1). It should be noted that there were overlaps among the drug records in KEGG DRUG; for example, D00752, D05094, D05095, and D05096 all indicate mycophenolic acid and its prodrugs. However, because applying manual data cleaning to such a large dataset was not practical, we instead counted the number of publications on each drug based on the drug ID. There were 5409 publications on drugs in the major group, accounting for 39.0% of the total of 13868 publications. On the other hand, there were 4495 (32.4%) and 1533 (11.1%) publications on drugs in the moderate and minor groups, respectively (Fig. 2C). That the group with the largest number of publications was the major group, which constitutes less than 1% of the drugs in the KEGG DRUG Database, indicates that only a small number of extensively studied drugs accounted for a large proportion of TDM-related publications.
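The shares quoted above follow directly from the reported counts, as the short check below shows. All numbers are taken from the text; taking the sum of the three drug groups as the denominator for the drug share is an assumption made for this sketch.

```python
TOTAL_PUBLICATIONS = 13868

# Publication and drug counts per group, as reported in the text.
group_publications = {"major": 5409, "moderate": 4495, "minor": 1533}
group_drugs = {"major": 62, "moderate": 393, "minor": 12026}


def percent(part, whole):
    """Return part / whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


publication_share = {
    group: percent(n, TOTAL_PUBLICATIONS) for group, n in group_publications.items()
}

# Assumption: the drug entries considered are exactly those in the three groups.
total_drugs = sum(group_drugs.values())
drug_share = {group: percent(n, total_drugs) for group, n in group_drugs.items()}

if __name__ == "__main__":
    print(publication_share)
    print(drug_share)
```

The major group accounts for about 0.5% of the drug entries under this assumption, which agrees with the statement that it constitutes less than 1% of the drugs.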
Subsequently, when we focused on the drug classes (corresponding to efficacies in KEGG DRUG), we determined that the most studied class was immunosuppressant, followed by antibacterial (hereinafter referred to as antibiotics) and antineoplastic (hereinafter referred to as antineoplastic agents) (Fig. 2D). This result was derived based mainly on drugs in the major group, because these drugs constituted a major fraction of the subjects of the publications on each drug class: in particular, for immunosuppressants, 1757 out of 1808 publications were derived based on drugs in the major group. By contrast, for antivirals (hereinafter referred to as antiviral agents), the proportion of publications in the major group was relatively low (153 out of 570). This indicates that drugs with relatively small numbers of TDM-related studies constituted a significant portion of TDM research on antiviral agents, resulting in the number of publications for this drug class becoming one of the largest among the studied drug classes.
Visualization of Historical Changes in Topic Words in TDM Research
Next, we analyzed how the themes of TDM research have changed over time. To this end, we first extracted text data (i.e., titles and abstracts) from PubMed and then subjected them to text-mining analysis. As mentioned earlier, this task may be performed using easyPubMed; however, it has been reported that fewer PubMed records are obtained via easyPubMed than from the PubMed website (reported in https://github.com/dami82/easyPubMed/issues/4).10,16) To overcome this problem, we developed a Python-based program to accurately retrieve the titles and abstracts of selected publications (Fig. 3A). Using this program, we retrieved the titles and abstracts of 13868 publications. We classified the publications into four groups based on their publication dates: period I (before 2003), period II (2003–2013), period III (2014–2019), and period IV (2020–2022). These periods were defined such that the publications were evenly distributed in number (3000–4000 publications per period).
Fig. 3. (A) Overview of the pipeline for retrieval of titles and abstracts of specific publications by PubMed ID. (B) Schematic of word-cloud analysis. (C) Word-cloud representation of topic words per period. (D) Word clouds generated as in (C) after entity extraction with a modified list of stop words (Supplementary Table S4). (E) Numbers of TDM-related publications for specific drugs between 1970 and 2022. I, II, III, and IV indicate periods of publication. (F) Publication counts for the drugs in (E), shown in a single histogram.
Subsequently, we retrieved topic words in each period and visualized them using word clouds, a visualization module used to depict the frequencies of words in documents20) (Fig. 3B). We extracted biomedical entities from the title of each publication by leveraging the spaCy model “en_core_sci_lg.”17) Figure 3C shows the word-cloud representations of the topic words in each period. Terms relevant to analytical methods, such as “chromatography,” “mass,” and “spectrometry,” were major topic words across all periods. The prevalence of these terms reflects studies that applied established LC or mass spectrometry (MS) methods or developed new ones, in both cases to monitor drug concentrations.21,22) Because the predominance of method-related terms makes it difficult to capture the topics distinct to each period, we removed these terms as stop words (Supplementary Table S4). The reanalysis identified topic words that were distinct to each period, as follows (Fig. 3D). In period I, cyclosporine, theophylline, phenytoin, and carbamazepine were the most common topic words. In period II, theophylline and anticonvulsants (phenytoin and carbamazepine) were not retrieved as topic words, indicating that the number of TDM studies on theophylline and anticonvulsants decreased in this period. Among immunosuppressants, tacrolimus and mycophenolic acid were more common than cyclosporine. The frequency of tacrolimus further increased in period III, when it became one of the most predominant topic words. Conversely, the frequency of mycophenolic acid decreased in period III, indicating that TDM of mycophenolic acid had already been extensively studied in period II. Vancomycin also appeared among the most common topic words in period III. Other than drugs, terms relevant to inflammatory bowel disease (IBD) were featured in period III, indicating that TDM studies on drugs for IBD were extensively conducted in this period. In period IV, vancomycin and words relevant to IBD remained the most common topic words.
Drugs retrieved by word-cloud analysis in each period largely coincided with the number of publications (Figs. 3E, 3F). In period I, TDM-related publications on theophylline, phenytoin, and cyclosporine were more numerous than those on tacrolimus, mycophenolic acid, and vancomycin. In period II, TDM-related publications on tacrolimus and mycophenolic acid increased and surpassed in number those on theophylline, phenytoin, and cyclosporine. In period III, the number of TDM-related publications on mycophenolic acid gradually decreased, whereas that for vancomycin gradually increased, resulting in the predominance of tacrolimus and vancomycin. This predominance of tacrolimus and vancomycin then remained in period IV. Likewise, TDM-related publications on IBD became more frequent after period III (Supplementary Fig. S1). These results support the result of the word-cloud analysis.
Topic Model Analysis by LDA
Finally, we attempted to categorize publications based on their abstracts. To this end, we applied LDA, a well-established technique for topic modeling.18) With LDA, each document can be tagged with topics, making it possible to classify documents based on their content. Figure 4A shows a schematic diagram of the analysis. We started by converting each abstract into a matrix of token counts. We extracted the biomedical entities from the abstract of each publication, as in Fig. 3. We also adopted the TF–IDF approach, which determines the weights of words in a document based on the frequency of each word within the document (term frequency) compared with its frequency in other documents (inverse document frequency). The two matrices, with and without TF–IDF weighting, were then subjected to LDA. First, we determined the optimal number of topics for LDA, between 10 and 20, by measuring perplexity. Perplexity is an indicator of the predictive power and generalizability of a topic model, with a lower perplexity indicating a better generalization ability.18) For the matrix with TF–IDF weighting, the perplexity steadily increased when the number of topics increased from 10 to 20 (Supplementary Fig. S2, left). Conversely, for the matrix without TF–IDF weighting, the perplexity decreased as the number of topics increased (Supplementary Fig. S2, right). The perplexity scores without TF–IDF weighting were also significantly lower than those with TF–IDF weighting. Based on these results, we performed the LDA analysis with the number of topics K = 20 without TF–IDF weighting.
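For reference, the perplexity used to select K can be written as below, following the definition in the original LDA paper cited above.18) The symbols are introduced only for this sketch and do not appear in the text: M is the number of evaluated documents, \mathbf{w}_d is the sequence of words in document d, and N_d is its length.

```latex
\[
\operatorname{perplexity}(D)
  = \exp\!\left( - \frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
\]
```

A lower value means that the fitted model assigns a higher average likelihood per word. The sklearn implementation reports an approximation of this quantity based on a variational bound; whether it was evaluated on held-out or on training abstracts is not stated in the text.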
Fig. 4. (A) Schematic of LDA. (B) Proposed titles and top 10 terms for each topic. (C) Changes in percentage of publication counts per topic.
We listed the group names alphabetically, with the title of each group derived based on the topic words retrieved by LDA and word-cloud analysis (Fig. 4B, Supplementary Fig. S3). Although we were able to assign a distinct title to most groups, we could not do the same for group B because of the presence of multiple topic words with ambiguous connections (i.e., words relevant to tuberculosis and carbamazepine).
Publications in groups A (MS) and R (LC) were both relevant to analytical methods. These two groups, when combined, included large numbers of publications across all periods. This result is consistent with the word-cloud analysis that featured method-related terms (Fig. 3C). Groups A and R showed contrasting trends in publication counts: publications in group A showed steady increases in number, whereas those in group R showed steady decreases in number across periods. This indicates that the focus of research has transitioned from LC to MS.
Next, we focused on topics other than groups A and R, as shown in Fig. 3D. In period I, groups I (immunoassay) and P (TDM using saliva samples) had large numbers of publications: 429 (13.5%) and 272 (8.6%), respectively (Fig. 4C, Supplementary Table S5). It was inferred that the large number of publications in group P was mainly due to TDM-related studies on theophylline, because the word cloud for group P in period I featured this drug (Supplementary Fig. S3P, period I).
In period II, group T (transplantation) became the largest group in terms of the number of publications. This was consistent with the result of the word-cloud analysis, which featured immunosuppressants (Fig. 3D, period II). The number of TDM studies regarding the human immunodeficiency virus (HIV; group J) increased greatly (approximately 200) compared with period I. The word-cloud analysis featured a broad range of anti-HIV agents, which were developed in the late 1990s to 2000s, indicating that newly developed anti-HIV agents were studied in the context of TDM (Supplementary Fig. S3J, period II).22,23) Other than groups J and T, group H (antifungal agents) included large numbers of publications in period II and maintained its publication counts in periods III and IV. Considering the large contribution of voriconazole in this group (Supplementary Fig. S3H), the increase in the number of publications for group H in period II was presumably elicited by the U.S. approval of voriconazole in 2002.24)
In period III, groups E (vancomycin) and L (IBD) demonstrated increases of more than 200 in the numbers of publications compared to their corresponding numbers in period II. This trend is consistent with the results of the word-cloud analysis for period III (Fig. 3D). Although not to the same extent as for groups E and L, the number of publications in group C (β-lactam antibiotics) increased in period III. Increases in the publication counts of groups E (vancomycin) and C (β-lactam antibiotics) indicate that efforts have been focused on developing TDM for antibiotics. Period III also witnessed a decrease in publications for group J (HIV). Publication counts from PubMed showed a peak in period II and a decrease thereafter (Supplementary Fig. S4A). Thus, it can be inferred that most research efforts on TDM for anti-HIV drugs had been concentrated mainly in period II.
In period IV, groups E (vancomycin) and L (IBD) remained two of the largest in terms of the number of publications, which was consistent with the word-cloud analysis (Fig. 3D, period IV).
On the other hand, the small numbers of publications in group S (cancer) across all periods were not consistent with the result shown in Fig. 2D, in which antineoplastic agents ranked third in the number of publications. To investigate this, we performed a PubMed search and found that TDM studies in the field of cancer treatment research have become more frequent since period III (Supplementary Fig. S4B). A closer look at the word clouds for periods III and IV showed that terms related to cancer (e.g., “cancer,” “asparaginase,” and “imatinib”) were featured in groups F (polymorphism) and H (antifungal agents) (Supplementary Figs. S3F, S3H). This indicates that in periods III and IV, a substantial fraction of TDM studies in the field of cancer treatment research were linked with polymorphism or the treatment of fungal infections, so that TDM-related publications relevant to cancer treatment were dispersed across several groups.
Altogether, we propose a historical overview of TDM research in Fig. 5.
Fig. 5. Major themes in each period and expected topics in the next few years are described.
Despite the remarkable progress in TDM practice, there is insufficient information on how the field of TDM research has developed over the years. One recent study summarized an overview of TDM practices.25) However, that study was quite biased and not comprehensive because the drugs included therein were limited to commonly used ones. To obtain an unbiased and comprehensive overview of TDM research trends, we applied a series of text-mining approaches.10–12) To our knowledge, this is the first study to employ text-mining techniques to provide a historical overview of TDM research trends.
The number of publications related to TDM increased steadily (Fig. 1). This indicates an upward trend in TDM research activity. By applying an automated system to perform PubMed searches, we found that according to the number of publications, tacrolimus was the most studied drug in TDM-related research (Fig. 2B). The top-ranked drugs were all drugs that are subject to TDM, validating the PubMed search term. In terms of drug classes, immunosuppressants were the most studied class, followed by antibiotics and antineoplastic agents (Fig. 2D). The result shown in Fig. 2D consists mainly of the drug classes of the top-ranked drugs shown in Fig. 2B, indicating that the TDM research field is constituted by only a small portion of all drugs. Indeed, drugs with publication counts ≥100 (the major group) represented less than 1% of the drugs in KEGG DRUG but accounted for 39.0% of TDM-related publications (Fig. 2C and Supplementary Table S1). Together, we conclude that only a small number of drugs have been the focus of most TDM studies.
Subsequently, we adopted well-established text-mining techniques to determine the evolution of the TDM research field over time. The result of word-cloud analysis and LDA identified distinct topics in each period, enabling us to understand when the top-ranked drugs shown in Fig. 2B were extensively studied.
The main conclusion of this study is a historical overview of TDM research, which is summarized in Fig. 5. In period I (before 2003), TDM research focused on theophylline, anticonvulsants, immunosuppressants (mainly cyclosporine), and TDM methods. The first two topics showed downward trends, whereas the last two topics remained important across all periods. In period II (2003–2013), TDM studies on immunosuppressants focused on mycophenolic acid and tacrolimus as well as cyclosporine. Period II also witnessed increases in the number of publications on drugs classified as antifungal agents (group H) and for the treatment of HIV (group J), although the absolute numbers of publications were not as large as that for immunosuppressants. In period III (2014–2019), TDM studies on HIV treatment became less frequent, whereas those on antifungal agents remained constant in number. With regard to immunosuppressants, publications on cyclosporine and mycophenolic acid decreased in number, and studies focused mainly on tacrolimus. This reflects the accumulating evidence of tacrolimus as a first-line calcineurin inhibitor.26,27) On the other hand, publications on IBD, antibiotics, and antineoplastic agents increased in number and became major topics of TDM-related research, a trend that continued into period IV (2020–2022). Word-cloud representations of group L (IBD) featured biopharmaceuticals (e.g., infliximab and adalimumab, Supplementary Fig. S3L). This reflects the TDM practice in biologic therapies since the 2010s, although the practice is not common in several countries including Japan.4,6,28,29) Terms relevant to rheumatoid arthritis (RA) were not featured as prominently as those relevant to IBD, despite the use of infliximab and adalimumab in the treatment of RA (Fig. 3D, Supplementary Fig. S3L). This indicates that TDM studies on infliximab and adalimumab have been conducted mainly in the context of IBD treatment.
TDM studies on antibiotics focused mainly on vancomycin, because the publication counts of group E (vancomycin) were larger than those of group C (β-lactam antibiotics). For group C, a plausible reason for the increase in the number of publications is the need for optimization of antibiotic use to combat the global concern of antimicrobial resistance, which was raised by The Review on Antimicrobial Resistance in 2014.30) The increase in the number of TDM studies on vancomycin presumably resulted from the first consensus guidelines for vancomycin TDM published in 2009, which led to research studies on evaluating these guidelines.31–34) For antineoplastic agents, asparaginase (group F), imatinib (group H), methotrexate (group S), and 5-fluorouracil (group S) were featured in the word cloud (Supplementary Figs. S3F, S3H, S3S, periods III and IV). The first three drugs are used for the treatment of leukemia, indicating that among recent TDM-related studies in the field of cancer treatment research, many have focused on the treatment of leukemia.
In addition to the longitudinal trends in TDM research, our analysis also shed light on future research trends. We infer that TDM studies on IBD, antibiotics (mainly vancomycin), and antineoplastic agents, which have been major topics since period III, will be placed at the center of TDM research in the next few years, because publication counts in PubMed have shown upward trends (Fig. 3E and Supplementary Figs. S1 and S4B). Conversely, publication counts for tacrolimus and cyclosporine, which are the most studied drugs at present (Fig. 2B), have shown a moderate increase and remained constant, respectively (Fig. 3E). Based on this trend, taken together with the large numbers of TDM-related publications regarding vancomycin at present, the currently top-ranked drugs (tacrolimus and cyclosporine) will eventually be replaced by vancomycin. Meanwhile, TDM studies on antifungal agents have had relatively large numbers of publications since period II. The word frequencies of newly approved drugs, posaconazole and isavuconazole, have increased steadily since period II and became the most frequent words in period IV (Supplementary Fig. S3H). This indicates that the TDM of these new drugs has garnered interest. The emergence of posaconazole and isavuconazole implies upward trends in the TDM of antifungal agents, and thus, we speculate that this class of drugs could potentially become a major topic of research.
The limitations of this study are as follows. First, we retrieved information on prescription drugs from the KEGG DRUG Database. However, as with other drug databases, KEGG DRUG is not comprehensive. Although we annotated drug classes based on the records in KEGG DRUG, these records were sometimes inconsistent with those used in clinical settings. For instance, during the preparation of this manuscript, thalidomide (D00754) was recorded as an antibiotic and antineoplastic agent, although it is not used as an antibiotic in clinical settings.35,36) Second, we analyzed only the publications recorded in PubMed. Evaluating publications recorded in other repositories, such as Google Scholar (https://scholar.google.com) and Web of Science (https://clarivate.com/webofsciencegroup/solutions/web-of-science/), will provide a more accurate and comprehensive historical overview of TDM research. It should also be noted that gray literature and unpublished studies were not addressed in this study, which may have introduced some bias. Third, we queried TDM-related publications based on MeSH indexing, and therefore, this study excluded publications that were not assigned MeSH terms. Indeed, only 66.9% (9281/13868) of TDM-related publications were recovered by the drug name search, indicating that a large number of publications were not assigned MeSH terms for the drug names (Fig. 2C).
Collectively, we used text-mining tools to provide an overview of how the TDM research field evolved over time, thereby serving as a foundation for future studies.
This study was supported by JSPS KAKENHI Grants JP24K17995 (T.M.), JP23H02669 (K.Y.), and JP22K19749 (H.M.), Chukyo Longevity Medical Research and Promotion Foundation (T.M.), and Morinomiyako Medical Research Foundation (T.M.).
T.M. designed the project, developed the modules, analyzed and interpreted the data, prepared the figures, and wrote the manuscript. H.M. and K.Y. analyzed and interpreted the data and wrote the manuscript.
The authors declare no conflict of interest.
This article contains supplementary materials. The following files are available free of charge: Supplementary Tables S1–S5 and Supplementary Figs. S1–S4. The dataset and code for the modules used to derive the results in this study are available at https://github.com/Matsuzaki-T/TDM_text-mining.