2022, Volume 258, Issue 2, pp. 79-90
The current work screened differentially expressed genes (DEGs) related to advanced clear cell renal cell carcinoma (ccRCC) to identify potential biomarkers and drugs for advanced ccRCC. After analyzing GSE53757 and GSE66271, we identified DEGs and performed functional annotation, pathway enrichment, validation, survival analysis, and candidate drug analysis. We obtained 861 common DEGs between advanced ccRCC tissues and normal kidney tissues from the two datasets, and performed Gene Ontology functional analysis and pathway analysis on them. The five most stable core gene groups and the top 10 genes were screened using the Cytoscape software. Repeating the functional and pathway analyses showed that the enrichment results for the core genes were similar to those for the total DEGs. After verification, the expression trends of the 10 hub genes did not change. Survival analysis showed that high expression of TOP2A, BIRC5, BUB1, MELK, RRM2, and TPX2 was associated with poorer overall survival, suggesting that these genes might participate in the occurrence, migration, and relapse of ccRCC. The gene-drug analysis showed that gallium nitrate, cladribine, and amonafide were strongly associated with RRM2 and TOP2A. We found that RRM2 and TOP2A might be predictive biomarkers and novel therapeutic targets for advanced ccRCC. Gallium nitrate, cladribine, and amonafide might be used for treating advanced ccRCC.
Kidney cancer comprises around 3% of overall cancer cases, and its morbidity has increased by 2% annually worldwide for the past 20 years (Ferlay et al. 2018). Renal cell carcinoma (RCC) arises from the renal parenchyma and is responsible for approximately 90% of all kidney cancer cases. Clear cell renal cell carcinoma (ccRCC) is the primary RCC subtype, comprising approximately 80% of all RCC. Compared to other cancer subtypes, ccRCC has higher tumor recurrence and metastasis rates (Dong et al. 2019; Ljungberg et al. 2019). Since ccRCC is poorly sensitive to radiotherapy, chemotherapy, and immunotherapy, surgery is the primary treatment (Zhang et al. 2021). In the last decade, several studies identified diagnostic markers for ccRCC. However, as the mechanism of molecular regulation is not clear, effective drugs have not been administered clinically. Thus, it is necessary to identify biomarkers and drugs for treating ccRCC.
The Gene Expression Omnibus (GEO) database functions as a comprehensive database for collecting tumor-related data. It contains sequencing results from all over the world. Recent studies have mined biomarkers of lung cancer (Long et al. 2020), breast cancer (Li et al. 2018), gastric cancer (Cao et al. 2018), pancreatic cancer (Ren et al. 2021) as well as other tumors through bioinformatics analysis of the GEO database. These biomarkers might contribute to diagnosing and treating diseases. In the current work, we searched for advanced ccRCC along with matched non-carcinoma renal tissues using the GEO database. We then mined differentially expressed genes (DEGs) for Gene Ontology (GO) enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, gene-drug analysis, and protein-protein interaction (PPI) analysis for identifying possible biomarkers and therapeutic agents for advanced ccRCC.
GEO is a publicly available oncogenomic database, and accessing it does not require ethics committee approval (Barrett et al. 2013). We obtained the advanced ccRCC datasets GSE53757 and GSE66271 from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). GPL570 (Affymetrix Human Genome U133 Plus 2.0 Array) was used as a platform for microarray datasets. The GSE53757 dataset has 60 advanced ccRCC samples and 60 healthy renal samples, while the GSE66271 dataset has 12 advanced ccRCC samples and 13 healthy renal samples (Table 1).
A summary of microarray datasets from Gene Expression Omnibus (GEO) datasets.
The DEGs between advanced ccRCC and normal tissues in the GSE53757 and GSE66271 datasets were identified with the GEO2R online tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/). P < 0.05 and |log FC| > 2 were regarded as statistically significant. A log FC value > 2 indicated that a gene was upregulated, whereas a log FC value < −2 indicated that it was downregulated. The DEGs were plotted using the Hiplot online drawing website (https://hiplot.com.cn/). We then identified the genes that were common to the GSE53757 and GSE66271 datasets and used a Venn diagram to visualize the intersecting genes (http://bioinformatics.psb.ugent.be/webtools/Venn/) (Jia et al. 2021).
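The thresholds and the intersection step described above can be sketched as a simple filter. This is a minimal illustration, not GEO2R's own implementation: it assumes that a table of (log FC, P-value) per gene has already been exported for each dataset, and the gene names and values in the toy inputs are hypothetical placeholders rather than study data.

```python
from typing import Dict, Set, Tuple

# Thresholds stated in the text: P < 0.05 and |log FC| > 2.
P_CUTOFF = 0.05
LFC_CUTOFF = 2.0

# Assumed layout for illustration: gene symbol -> (log_fc, p_value).
Results = Dict[str, Tuple[float, float]]


def classify_degs(
    results: Results,
    p_cutoff: float = P_CUTOFF,
    lfc_cutoff: float = LFC_CUTOFF,
) -> Tuple[Set[str], Set[str]]:
    """Split genes into (upregulated, downregulated) sets."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, (log_fc, p_value) in results.items():
        if p_value >= p_cutoff:
            continue  # not statistically significant
        if log_fc > lfc_cutoff:
            up.add(gene)
        elif log_fc < -lfc_cutoff:
            down.add(gene)
    return up, down


def common_degs(a: Results, b: Results) -> Tuple[Set[str], Set[str]]:
    """Intersect the up and down sets of two datasets separately."""
    up_a, down_a = classify_degs(a)
    up_b, down_b = classify_degs(b)
    return up_a & up_b, down_a & down_b


# Hypothetical toy inputs (placeholder genes and values).
dataset_a = {"G1": (3.1, 0.001), "G2": (-2.5, 0.01),
             "G3": (2.4, 0.20), "G4": (1.5, 0.001)}
dataset_b = {"G1": (2.2, 0.03), "G2": (-3.0, 0.002),
             "G3": (2.9, 0.01), "G4": (2.6, 0.001)}

common_up, common_down = common_degs(dataset_a, dataset_b)
print(sorted(common_up), sorted(common_down))
```

Intersecting the upregulated and downregulated sets separately ensures that a gene counted as common changes in the same direction in both datasets.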
Functional annotations for DEGs
We imported the selected DEGs into the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (https://david.ncifcrf.gov/) (Dennis et al. 2003), an online database for GO and KEGG pathway enrichment analysis of DEGs, with a threshold of P < 0.05.
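The idea behind term enrichment can be illustrated with a plain upper-tail hypergeometric test. This is only a sketch of the general over-representation principle: DAVID reports its own modified Fisher exact statistic, so the values below would not match DAVID's output, and all counts in the usage line are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background: int, n_term: int,
                       n_list: int, n_overlap: int) -> float:
    """Upper-tail hypergeometric P-value, P(X >= n_overlap).

    n_background: genes in the background set
    n_term:       background genes annotated with the term
    n_list:       genes in the submitted list (e.g. the DEGs)
    n_overlap:    submitted genes annotated with the term
    """
    total = comb(n_background, n_list)
    upper = min(n_term, n_list)
    tail = sum(
        comb(n_term, i) * comb(n_background - n_term, n_list - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 200 annotated with a
# term, 861 submitted genes, 30 of which carry the term.
p = enrichment_p_value(20000, 200, 861, 30)
print("enriched at P < 0.05:", p < 0.05)
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for large gene counts; the final division converts the result to a float.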
Establishing the protein-protein interaction (PPI) network
We constructed the PPI network on the basis of the STRING online database (https://string-db.org/) (Szklarczyk et al. 2021), and the Cytoscape software (http://www.cytoscape.org/) was adopted for visualization. Based on the cytoHubba plug-in, we selected the top 10 hub genes from the PPI network. Modules with the highest significance throughout the PPI network were constructed with the MCODE plug-in. We visualized the intersection of the top 10 genes and the five most stable core gene groups with a Venn diagram.
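Hub-gene selection can be illustrated by ranking the nodes of an interaction network. The sketch below is not the cytoHubba implementation: cytoHubba offers several ranking methods and the text does not state which one was used, so ranking by node degree is assumed here purely for illustration. The edge list is a hypothetical toy network.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def top_hub_genes(edges: Iterable[Tuple[str, str]],
                  k: int = 10) -> List[Tuple[str, int]]:
    """Return the k nodes with the highest degree as (gene, degree)."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree; break ties alphabetically so the
    # ranking is deterministic.
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:k]]


# Hypothetical toy network (placeholder node names).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(top_hub_genes(toy_edges, k=2))
```

Storing neighbors in sets means duplicate edges are counted once, which matches the usual treatment of an undirected PPI network.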
Functional enrichment analyses of the core genes
To conduct GO and KEGG pathway enrichment, we uploaded the DEGs of the five core gene groups to the DAVID database. The cut-off was set as P < 0.05.
Hub gene levels and prognostic value
We imported the hub genes into the GEPIA2 database (http://gepia2.cancer-pku.cn/) to calculate the prognostic index (Tang et al. 2019). Gene expression profiles in cancer tissues and normal tissues can be studied using this database. In addition, Kaplan-Meier (KM) curves were plotted to evaluate the prognostic outcome of advanced ccRCC cases based on the hub genes.
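The Kaplan-Meier curve mentioned above is built from the product-limit estimator, S(t) = ∏ (1 − d_i / n_i), where d_i is the number of deaths at time t_i and n_i is the number of patients still at risk. The following is a minimal sketch of that estimator, not the GEPIA2 implementation; how patients are split into high- and low-expression groups is outside the sketch, and the follow-up data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if censored
    Returns (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored cases leave the risk set
    return curve


# Hypothetical follow-up data: the second patient is censored.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(km_curve)
```

In practice, one curve would be estimated per expression group and the two curves compared with a log-rank test, which this sketch does not include.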
Establishing gene-drug interactions
The DGIdb database (https://dgidb.org/) was adopted for obtaining the drug-gene interaction data (Freshour et al. 2021). We imported the hub genes verified with GEPIA2 into this database and selected compounds or drugs using two filter criteria: support from previous studies and a relationship score > 2.0.
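The two filter criteria can be sketched as a predicate over interaction records. The record layout, field names, and all values below are hypothetical and chosen for illustration only; they do not reflect the actual DGIdb export format or any scores from this study.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Interaction:
    """Hypothetical drug-gene interaction record."""
    gene: str
    drug: str
    score: float          # relationship score
    n_publications: int   # number of supporting publications


def filter_interactions(records: Iterable[Interaction],
                        min_score: float = 2.0) -> List[Interaction]:
    """Keep records with score > min_score and literature support."""
    return [
        r for r in records
        if r.score > min_score and r.n_publications > 0
    ]


# Placeholder records (not study data).
records = [
    Interaction("GENE_A", "drug_1", 3.5, 2),
    Interaction("GENE_A", "drug_2", 1.2, 4),   # score too low
    Interaction("GENE_B", "drug_3", 4.0, 0),   # no supporting study
    Interaction("GENE_B", "drug_4", 2.1, 1),
]
kept = filter_interactions(records)
print([(r.gene, r.drug) for r in kept])
```

The strict inequality mirrors the "> 2.0" criterion in the text, so a record with a score of exactly 2.0 would be excluded.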
The GSE53757 and GSE66271 datasets were selected according to the filter conditions. GSE53757 has 1,334 DEGs, where 538 were upregulated and 796 were downregulated; GSE66271 has 1,438 DEGs, among which 796 presented upregulation, and 796 showed downregulation (Fig. 1). A Venn diagram was constructed to visualize the intersection of 861 common DEGs, among which 287 were upregulated with 574 being downregulated (Table 2, Fig. 2).
Volcano plot of differentially expressed genes between advanced clear cell renal cell carcinoma (ccRCC) tissues and normal kidney tissues in datasets GSE53757 and GSE66271.
Red denotes genes with high expression in tumor tissues, and blue stands for low expression in tumor tissues. (A) GSE53757; (B) GSE66271.
The common differentially expressed genes (DEGs).
Venn diagram to intersect differentially expressed genes (DEGs) of advanced ccRCC.
(A) Common up-regulated DEGs, 287 in total. (B) Common down-regulated DEGs, 574 in total.
Concerning the GO-Biological Process (BP) terms, the DEGs were mostly associated with “cell adhesion”, “signal transduction”, “immune response”, “inflammatory response”, “positive cell proliferation regulation”, “neutrophil degranulation”, “negative apoptotic process regulation”, “xenobiotic stimulus response”, “proteolysis”, and “angiogenesis”. In the Cellular Components (CC) annotation, these DEGs were located in the “extracellular exosome”, “extracellular region”, “plasma membrane”, “apical plasma membrane”, “basolateral plasma membrane”, “integral plasma membrane component”, “membrane”, “extracellular space”, “cell surface”, “integral membrane component”, and “external side of the plasma membrane”. In Molecular Function (MF), “oxidoreductase activity”, “protein homodimerization activity”, “protein binding”, “zinc ion binding”, “identical protein binding”, “receptor binding”, “calcium ion binding”, “macromolecular complex binding”, “transmembrane transporter activity” and “heparin binding” were clustered. Regarding the KEGG analysis, the DEGs were closely related to the pathways of “focal adhesion”, “cytokine-cytokine receptor interaction”, “metabolic pathway”, “phagosome”, the “PPAR pathway”, “cell adhesion molecules”, the “chemokine pathway”, “complement and coagulation cascade”, “carbon metabolism”, and the “interaction between the viral protein and cytokine/cytokine receptor”. The top 20 DEGs associated with the BP, CC, and MF enrichment results and the top 10 DEGs associated with the KEGG pathway were sorted by counts and are shown in Fig. 3.
Bubble plots of the Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of DEGs.
The top 20 GO annotation results and the top 10 KEGG analysis results are presented. (A) Biological process; (B) Cellular component; (C) Molecular function; (D) Signal pathway.
The Cytoscape software was adopted for establishing a PPI network for DEGs from the STRING database. By analyzing the PPI network, the 10 most significant hub genes (ASPM, TPX2, DLGAP, NCAPG, MCM10, BIRC5, BUB1, MELK, RRM2, and TOP2A) of advanced ccRCC were detected with the Cytoscape plug-in cytoHubba. Five modules with the highest significance were acquired using the MCODE plug-in. The default conditions were used for all parameters (Fig. 4).
Module analysis from the protein-protein interaction (PPI) network.
Using Cytoscape software, the top 10 genes and the five most stable core gene groups were screened from the protein-protein interaction network. Red signifies up-regulated genes, and blue stands for down-regulated genes. A Venn diagram was used to visualize the top 10 genes contained in the five core gene groups. (A) PPI enrichment map, with 717 nodes and 8,110 edges; (B) Top 10 genes, with 10 nodes and 45 edges; (C) Core gene group 1, with 27 nodes and 676 edges; (D) Core gene group 2, with 21 nodes and 234 edges; (E) Core gene group 3, with 14 nodes and 92 edges; (F) Core gene group 4, with 46 nodes and 296 edges; (G) Core gene group 5, with 29 nodes and 168 edges; (H) Venn diagram visualizing the genes common to the top 10 genes and the core gene groups. (B) shows the top 10 genes obtained with the cytoHubba plug-in, while (C-G) show the core gene groups obtained with the MCODE plug-in of Cytoscape.
For the GO-BP terms, the genes in the five core gene groups were mostly associated with “cell division”, “chromosome segregation”, “cell cycle”, “apoptosis process”, “protein phosphorylation”, “cell proliferation”, “mitotic cytokinesis”, “DNA repair”, “DNA damage stimulus response of cells”, “immune response”, “signaling”, and “inflammatory reaction”. In the CC annotation, these genes were located in the “nucleus”, “nucleoplasm”, “cytosol”, “membrane”, “midbody”, “kinetochore”, “microtubule”, “chromosome”, the “centromeric region”, “centrosome”, “plasma membrane”, “extracellular region”, and the “cell surface”.
In the MF terms, “protein homodimerization activity”, “protein kinase binding”, “ATP-dependent microtubule motor activity, plus-end-directed”, “protein binding”, “microtubule binding”, “ATP binding”, “microtubule motor activity”, “identical protein binding”, “chemokine activity”, “receptor binding”, “heparin binding” as well as “transmembrane signaling receptor activity” were clustered.
Regarding the KEGG analysis, “metabolic pathways”, the “Toll-like receptor pathway”, the “chemokine pathway”, the “Rap1 pathway”, the “PI3K-Akt pathway”, the “interaction between cytokine and cytokine receptor”, the “interaction between the viral protein and cytokine/cytokine receptor”, “focal adhesion”, “Coronavirus disease-COVID-19”, the “HIF-1 pathway”, “Human papillomavirus infection”, and the “Natural killer cell-regulated cytotoxicity” pathways were closely related to these core genes. The top 10 GO annotations and KEGG pathway enrichment were sorted by counts and are shown in Fig. 5.
Two-coordinate graph of Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of differentially expressed genes (DEGs) in each core gene group.
The light blue represents core gene group 1; the orange represents core gene group 2; the red represents core gene group 3; the green represents core gene group 4; the dark blue represents core gene group 5; and the yellow curve represents −log10 (P-value). (A) Biological process; (B) Cellular component; (C) Molecular function; (D) Signal pathway.
We analyzed the survival relevance of the hub genes to further identify prognostically relevant genes in advanced ccRCC. Using the GEPIA2 database, we examined how the hub genes affect the survival of ccRCC patients. The BIRC5, BUB1, MELK, RRM2, TOP2A, and TPX2 levels showed a clear association with the overall survival of the patients (Fig. 6). In addition, the overall survival rate of ccRCC patients was reduced considerably when these hub genes were upregulated (P < 0.05) (Fig. 7).
Boxplots validating hub gene expression in tumor and normal kidney tissue from advanced ccRCC patients using clinical data.
*P < 0.05. Red represents advanced ccRCC tissue, and gray represents normal kidney tissue.
The overall survival curve of the hub gene was drawn with the use of the GEPIA platform.
P < 0.05 was regarded as statistically significant. The survival curves indicated that advanced ccRCC patients with high expression of the up-regulated hub genes had a shorter survival time. Red stands for high gene expression, and blue represents low gene expression. (A) BIRC5, (B) BUB1, (C) MELK, (D) RRM2, (E) TOP2A, (F) TPX2.
In order to investigate the correlation between genes and drugs, the six verified hub genes were uploaded to the DGIdb database. We only identified RRM2 and TOP2A and matched them with three estimated therapeutic agents (gallium nitrate, cladribine, and amonafide) (Table 3). The filter criteria included a relationship score > 2.0 and being supported by previous studies (Table 4).
Candidate drugs targeting the hub genes.
N/A, Not Applicable.
Publications associated with the candidate drugs targeting the hub genes.
RCC mainly includes ccRCC, chromophobe RCC, and papillary RCC (Ljungberg et al. 2019). ccRCC is one of the deadliest subtypes of urinary malignancy and has the highest tumor metastasis rate, recurrence rate, and mortality rate among histological subtypes of RCC (Patard et al. 2005). The prognosis of advanced ccRCC is poor, with a five-year survival rate as low as 11.7% (Siegel et al. 2017). Radiotherapy or chemotherapy has no significant effect on ccRCC patients; the most effective treatment method is surgical resection (Makhov et al. 2018). However, postoperative metastasis in patients with ccRCC is as high as 30%. The survival rate of patients with metastasis or preoperative metastasis is low, and the average survival time is less than one year (Hsieh et al. 2017). Therefore, elucidating the pathogenesis of advanced ccRCC is necessary, as it might help to develop clinical therapeutic strategies for ccRCC and identify potential molecular targets for ccRCC-targeted drugs. With the rapid progress of bioinformatics, a large amount of microarray and sequencing data has become available, which can be used for diagnosing and identifying treatment targets for various diseases (Batai et al. 2018; Martinez-Romero et al. 2018). Bioinformatics methods have recently been used to conduct secondary analysis of ccRCC-related data to identify candidate genes of ccRCC (Yuan et al. 2018). In the current work, we explored gene-drug prediction in more detail to identify potential drugs that might be used for treating advanced ccRCC and to provide new ideas for other studies.
Two datasets, GSE53757 and GSE66271, were analyzed online through GEO2R to identify DEGs in advanced ccRCC compared to healthy renal tissues. Then, we used various bioinformatics analysis methods to analyze the DEGs for a better understanding of these genes based on GO annotation, the KEGG enrichment analysis, and the PPI network analysis. Then, we performed verification, gene-drug interaction, and survival analyses on the hub genes. Our study was the first one to find three drugs (gallium nitrate, cladribine, and amonafide) associated with advanced ccRCC through bioinformatics methods. Our findings might provide new treatment strategies for advanced ccRCC patients.
In this study, we obtained 861 DEGs (287 were upregulated and 574 were downregulated). In BP annotation, the DEGs were primarily associated with “cell adhesion”, “signaling”, “inflammatory reaction”, “immune response”, “positive cell proliferation regulation”, “neutrophil degranulation”, “negative apoptotic process regulation”, and “xenobiotic stimulus response”. Regarding the CC terms, the DEGs were mostly associated with “basolateral plasma membrane”, “apical plasma membrane”, “external side of the plasma membrane”, “cell surface”, “membrane component”, “plasma membrane”, “membrane”, and “plasma membrane component”. In MF annotation, these DEGs were predominantly related to “identical protein binding”, “protein binding”, “calcium ion binding”, “zinc ion binding”, “receptor binding”, “heparin binding”, and “macromolecule complex binding”. As revealed by the KEGG enrichment analysis, the common DEGs were predominantly concentrated in “metabolic pathway”, “PPAR signaling pathway”, “carbon metabolism”, “chemokine pathway”, “interaction of cytokine with cytokine receptor”, and “interaction between the viral protein and cytokine/cytokine receptor”.
Based on the PPI network, five core gene groups with the most stable structures were extracted. Then, we identified the hub genes that might be essential drug targets for advanced ccRCC and used Venn diagrams to visualize the top 10 common genes and gene clusters. We found that the 10 most significant genes were the hub genes. Thus, we analyzed the genes of the core groups with GO and KEGG. These core genes were mostly associated with the immune and inflammatory responses of the biological process. The upregulated DEGs of ccRCC were primarily involved in inflammatory and immune responses (Zhang et al. 2019; Xu et al. 2020). Regarding the cellular components, the DEGs were mainly associated with the membrane; this was also found by Wang et al. (2020). The upregulated genes of ccRCC are mainly located on the plasma membrane (Quan et al. 2021). Chen et al. (2020) concluded that the DEGs of ccRCC are also enriched in the nucleus (“nucleus” and “kinetochore”), and this observation was similar to the findings of our study. Similar to the results of Quan et al. (2021), we found that the molecular functions of the DEGs in advanced ccRCC were associated with the binding function of proteins and metal ions. Several studies have also verified this finding (Wang et al. 2020). The KEGG pathway of the core genes included the “PI3K-Akt pathway”, the “interaction of cytokine with cytokine receptor”, and “focal adhesions”. In some studies, the above three pathways were shown to be related to and have an important effect on ccRCC progression (Wang et al. 2019; Zhou et al. 2019).
The hub genes were imported into GEPIA2 for further verification. The expression trends of the 10 hub genes remained unchanged after confirmation with the TCGA database. The differential expression of six genes (BIRC5, BUB1, MELK, RRM2, TOP2A, and TPX2) was statistically significant between the ccRCC samples and normal kidney samples. Moreover, according to survival analysis, upregulation of these hub genes was associated with a lower overall survival rate, suggesting a close association of these hub genes with the progression of ccRCC. Gene-drug interaction analysis was performed on the six verified hub genes. Only three small-molecule drugs (gallium nitrate, cladribine, and amonafide) were found to be associated with RRM2 and TOP2A.
The RRM2 protein is one of the subunits of the ribonucleotide reductase complex, which catalyzes the formation of deoxynucleotides and exerts a vital function in DNA synthesis. It is associated with tumor growth, angiogenesis, invasion, metastasis, and patient prognosis (Chen et al. 2019). Thus, it is an effective anti-tumor target enzyme. RRM2 upregulation is tightly related to the genesis and development of different types of cancer, such as gastric cancer (Kang et al. 2014), colorectal cancer (Hsieh et al. 2016), non-small cell lung cancer (Mah et al. 2015), and nasopharyngeal cancer (Han et al. 2015). Moreover, in the genitourinary system, studies have also demonstrated that upregulation of RRM2 may reduce survival in bladder cancer (Morikawa et al. 2010) and adrenocortical cancer (Grolmusz et al. 2016). The findings of the current work suggest that RRM2 can be applied as a potential biomarker of advanced ccRCC. Some studies also found that the expression of RRM2 is associated with Fuhrman grade and pathological stage (Zou et al. 2019). Conversely, knockout of the RRM2 gene leads to arrest of ccRCC cell lines, thereby inhibiting tumor growth (Osako et al. 2019).
TOP2A has an important effect on the topological state of DNA during replication and transcription. The occurrence of cancer is inseparable from the process of DNA metabolism. As an enzyme that plays an irreplaceable role in gene expression, TOP2A is closely related to diverse categories of cancer, including pancreatic cancer (Pei et al. 2018) and gastric cancer (Terashima et al. 2017). In renal cell carcinoma, TOP2A is associated with renal papillary carcinoma (Ye et al. 2018) and renal clear cell carcinoma (Zhang et al. 2019). The expression of TOP2A in renal clear cell carcinoma is related to pathological stage, which might result in the progression of renal cell carcinoma (Chen et al. 2018). Thus, TOP2A might become a new prognostic marker for RCC.
Anti-tumor drugs act in various ways, but the cornerstone of cancer therapy is DNA targeting. Through gene-drug interaction analysis, gallium nitrate and cladribine were identified as inhibitors of RRM2. Amonafide is related to TOP2A, but the relationship is not clear. Gallium is a metal that is pharmacologically similar to iron. It can inhibit ribonucleotide reductase by replacing iron in its M2 subunit, leading to the loss of a tyrosyl radical, thereby inhibiting ribonucleotide reductase activity, DNA synthesis, and tumor growth (Narasimhan et al. 1992). Gallium nitrate can inhibit RRM2 and, thus, might be a potential drug for advanced ccRCC (Chitambar 2004). Cladribine is a synthetic purine nucleoside analog that promotes lymphocyte depletion mainly by continuously reducing B lymphocytes (Jacobs et al. 2018; Pfeuffer et al. 2022). Cladribine was initially used for hematological diseases (Beutler 1992). It can also be administered for treating other conditions, such as multiple sclerosis (Giovannoni et al. 2018) and cervical cancer (Yi et al. 2019). Nevertheless, no study has investigated the effect of cladribine on ccRCC. Cladribine might have an inhibitory effect on the RRM2 gene, and thus, it might inhibit advanced ccRCC (Zhenchuk et al. 2009; Cao et al. 2013). Amonafide induces apoptosis by intercalating DNA and blocking the binding of Topo II to DNA (Allen and Lundberg 2011). It can treat acute myeloid leukemia (Freeman et al. 2012) and breast cancer (Costanza et al. 1995). The mechanism of action of amonafide has been linked to TOP2A (Quintana-Espinoza et al. 2013; Tan et al. 2015). Although the correlation is unclear, amonafide might be a potential targeted drug for ccRCC.
This study had some limitations. First, the bioinformatics analyses were conducted without experimental verification. Second, there was a certain risk of bias. Finally, the screened hub genes and drugs need to be verified empirically.
In conclusion, through bioinformatics analysis, we found that RRM2 and TOP2A have an important effect on the genesis, progression and prognostic outcome of advanced ccRCC. Thus, they might serve as novel diagnostic and therapeutic biomarkers. Three identified drugs (gallium nitrate, cladribine, and amonafide) might be administered for the treatment of advanced ccRCC.
We thank all the public databases and websites used in the present study. This study was supported by the Fujian Provincial Nature and Science Foundation (Grant No. 2020J011206).
The authors declare no conflict of interest.