Genes & Genetic Systems
Online ISSN : 1880-5779
Print ISSN : 1341-7568
ISSN-L : 1341-7568
Full papers
Analysis for drug metabolism-related prognostic subtypes and gene signature in liver cancer
Yue ZhangJun ChenChengru HuXiangzhong HuangYan Li
Author information
JOURNAL OPEN ACCESS FULL-TEXT HTML
Supplementary material

2022 Volume 97 Issue 6 Pages 271-284

Details
ABSTRACT

Liver cancer is highly heterogeneous and has a poor prognosis. We aimed to identify a drug metabolism-related prognostic subtype and a gene signature as references for prognosis and therapy options for patients with liver cancer. Patient information was collected from online databases. Drug metabolism-related genes were obtained from previous studies and were used to screen differentially expressed prognostic genes. The patients were divided into different clusters and differences in clinical features, immunity, pathways and therapy responses between the clusters were analyzed. LASSO analysis was performed to identify the optimal prognostic genes and establish a risk score model. Finally, the risk score distribution in different subtypes was investigated. A total of 54 prognostic genes were identified to categorize the patients into cluster 1 and cluster 2. Cluster 1 showed worse survival than cluster 2, and cluster 1 also showed high levels of malignancy. Furthermore, cluster 1 exhibited a higher TIDE (tumor immune dysfunction and exclusion) score and lower IC50 response to paclitaxel, gemcitabine and camptothecin, indicating that cluster 1 individuals may derive more benefit from immunotherapy but less benefit from chemotherapy. The risk score, based on the six optimal prognostic genes, demonstrated an adequate prognostic capability. The high-risk group showed worse survival; meanwhile, cluster 1 contained the majority of high-risk samples. Our results should be useful for prognosis and specific therapy for patients with liver cancer. Patients with the features of cluster 1 and a high risk score will tend to exhibit worse survival. Furthermore, immunotherapy may be more suitable for cluster 1-type patients while chemotherapy may be more suitable for cluster 2 patients.

INTRODUCTION

Liver cancer is the sixth most common malignancy, and accounts for 8.3% cancer-related deaths, in the USA (Siegel et al., 2020). It is well known that liver cancer is highly heterogeneous, and a number of molecular subtypes based on gene mutation and other clinical factors have been proposed (Boyault et al., 2007; Hoshida et al., 2009; Yang et al., 2020). Based on these subtypes, differences are apparent in prognosis and sensitivity to immunotherapy and chemotherapy, which may be useful for clinicians to develop specific therapy strategies for patients with specific subtypes.

Metabolism has received increasing attention in recent years. Metabolic reprogramming is a hallmark of malignancy (Faubert et al., 2020). Tumor cells ingest glucose through the Warburg effect, which is considered to be a universal metabolic alteration in carcinogenesis (Vaupel et al., 2019). The Warburg effect provides energy for tumor cells and accelerates the accumulation of glycolytic intermediates that are required for cancer biomass synthesis. Furthermore, in the tumor microenvironment, development of tumor cells suppresses the function of immune cells through competitive uptake of nutrients. Moreover, metabolites can further suppress the function of immune cells, resulting in immune escape (Siska et al., 2020). Lactate accumulation resulting from the Warburg effect drives not only tumor progression but also resistance to certain antitumor therapies (Vaupel et al., 2019). These observations demonstrate that metabolism plays a pivotal role in tumor development and treatment. In addition, metabolic abnormalities in the liver, which is the largest metabolic organ in the human body, are tightly correlated with the occurrence, development and therapeutic drug efficacy of liver cancer. Compared to normal tissue, there is high metabolic heterogeneity in tumor tissue (Faubert et al., 2020), and it is therefore urgent to identify patients with specific drug metabolic characteristics for development of targeted and effective treatment strategies.

A group of genes are proposed to be involved in drug absorption, distribution, metabolism and excretion (ADME) in various cancers (Hu et al., 2020). The aims of the present study were to identify an effective prognostic subtype that is related to drug metabolism based on these ADME genes, and to further investigate the prognostic signature for patients with liver cancer.

RESULTS

The flowchart of this study is shown in Fig. 1.

Fig. 1.

The study design. ADME: drug absorption, distribution, metabolism and excretion.

Analyses for drug metabolism genes

A total of 122 differentially expressed genes were selected (Fig. 2A) from the 239 drug metabolism-related genes listed in Supplementary Table S1 (see Materials and Methods). Further analyses showed that these 122 differential drug metabolism genes were enriched in 194 BP (biological process), 5 CC (cellular component) and 79 MF (molecular function) Gene Ontology (GO) terms, and 22 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. The top 20 significant GO terms and all the KEGG terms are shown in Fig. 2B–2E. These results verified that these genes were closely associated with carcinogenesis and various metabolic processes such as drug metabolism, fatty acid metabolism, biosynthesis, antifolate resistance and arachidonic acid metabolism. The 122 genes were closely related to metabolism-related processes and pathways. Construction of a protein–protein interaction (PPI) network revealed 46 genes and 206 relationship pairs (Fig. 2F).

Fig. 2.

Analysis for drug metabolism-related genes. (A) Volcano plot for identification of differentially expressed drug metabolism genes. (B) Gene Ontology (GO)-biological process (BP) analysis for differentially expressed drug metabolism genes (TOP20). (C) GO-cellular component (CC) analysis for differentially expressed drug metabolism genes. (D) GO-molecular function (MF) analysis for differentially expressed drug metabolism genes (TOP20). (E) Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses for differentially expressed drug metabolism genes. (F) Protein–protein interaction (PPI) network analysis for differentially expressed drug metabolism genes. The depth of color indicates the size of the difference; red, upregulated; blue, downregulated.

Identification of prognostic drug metabolism genes and subtypes

A total of 54 genes with P < 0.05 were selected as prognostic genes through univariate Cox analysis. With these 54 prognostic genes, consistent clustering analysis was conducted and the tumor samples were divided into two subtypes: cluster 1 and cluster 2 (Fig. 3A). Kaplan–Meier analysis showed that cluster 1 exhibited a worse prognosis (Fig. 3B).

Fig. 3.

Classification of subtypes and survival analysis. (A) All the samples are divided into cluster 1 and cluster 2. (B) Cluster 1 shows worse survival.

Clinical differences between the two clusters

Next, the clinical features of the two clusters were compared. Significant differences between them showed in age, gender, grade, neoplasm histologic grade, pathologic T, pathologic stage and vascular tumor cell type (Table 1). Interestingly, cluster 1 showed higher levels of malignancy.

Table 1. Differences in clinical features between the two clusters
CharacteristicCluster 1
(n = 195)
Cluster 2
(n = 168)
Total
(n = 363)
P-valueFDR
Age0.00340.02
< 60103 (28.37%)62 (17.08%)165 (45.45%)
≥ 6092 (25.34%)106 (29.20%)198 (54.55%)
Gender4.70E-053.80E-04
Female82 (22.59%)36 (9.92%)118 (32.51%)
Male113 (31.13%)132 (36.36%)245 (67.49%)
Neoplasm histologic grade1.70E-061.50E-05
G118 (4.96%)37 (10.19%)55 (15.15%)
G283 (22.87%)92 (25.34%)175 (48.21%)
G383 (22.87%)33 (9.09%)116 (31.96%)
G49 (2.48%)3 (0.83%)12 (3.31%)
NA2 (0.55%)3 (0.83%)5 (1.38%)
Pathologic M0.740.74
M0156 (42.98%)106 (29.20%)262 (72.18%)
M11 (0.28%)2 (0.55%)3 (0.83%)
NA38 (10.47%)60 (16.53%)98 (27.00%)
Pathologic N0.250.49
N0144 (39.67%)102 (28.10%)246 (67.77%)
N14 (1.10%)0 (0%)4 (1.10%)
NA47 (12.95%)66 (18.18%)113 (31.13%)
Pathologic T0.00160.0097
T180 (22.04%)100 (27.55%)180 (49.59%)
T254 (14.88%)37 (10.19%)91 (25.07%)
T352 (14.33%)24 (6.61%)76 (20.94%)
T49 (2.48%)4 (1.10%)13 (3.58%)
NA0 (0%)3 (0.83%)3 (0.83%)
Pathologic stage0.000470.0033
Stage I75 (20.66%)95 (26.17%)170 (46.83%)
Stage II50 (13.77%)34 (9.37%)84 (23.14%)
Stage III58 (15.98%)23 (6.34%)81 (22.31%)
Stage IV2 (0.55%)2 (0.55%)4 (1.10%)
NA10 (2.75%)14 (3.86%)24 (6.61%)
Vascular tumor cell type0.00580.02
Macro12 (3.31%)2 (0.55%)14 (3.86%)
Micro53 (14.60%)37 (10.19%)90 (24.79%)
None96 (26.45%)109 (30.03%)205 (56.47%)
NA34 (9.37%)20 (5.51%)54 (14.88%)
Recurrence0.140.41
No87 (23.97%)91 (25.07%)178 (49.04%)
Yes80 (22.04%)58 (15.98%)138 (38.02%)
NA28 (7.71%)19 (5.23%)47 (12.95%)

FDR, false discovery rate; NA, not available. P-values < 0.05 are in bold.

Immune differences between the two clusters

CIBERSORT analysis showed the proportions of 22 immune cell types in cluster 1 and cluster 2 (Fig. 4A–4B). The infiltration levels for 12 of these immune cells were significantly different (P < 0.05) between the two subtypes (Fig. 4C). Furthermore, the immune score of cluster 1 was significantly higher than that of cluster 2 (Fig. 4D).

Fig. 4.

Immune differences between the two clusters. (A) Distribution of 22 immune cell types in cluster 1. (B) Distribution of 22 immune cell types in cluster 2. (C) Differences in immune cell infiltration level between the two clusters. *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001. (D) Differences in immune score, stromal score and ESTIMATE score between the two clusters.

Pathway differences between clusters 1 and 2

To investigate the differences of enriched pathways between the two clusters, gene set variation analysis (GSVA) was adopted to calculate the enrichment score of each sample. Differential analysis showed that 64 pathways were significantly enriched in cluster 1 and 61 were significantly enriched in cluster 2. The top 10 enriched pathways are described in Fig. 5A. Cluster 1 was significantly enriched in drug and energy metabolism-related pathways, while cluster 2 was enriched in the cell cycle and DNA replication.

Fig. 5.

Differences in pathway, tumor immune dysfunction and exclusion (TIDE) score, and chemotherapy drug response between the two clusters. (A) Enriched TOP10 pathways of cluster 1 and cluster 2. (B) TIDE differences between the two clusters. (C) The most significantly different drug sensitivities between the two clusters.

Immunotherapy and chemotherapy differences between the two clusters

The TIDE (tumor immune dysfunction and exclusion) score of cluster 1 was significantly higher than that of cluster 2 (Fig. 5B), indicating a worse efficacy and poorer survival for these patients after receiving immune checkpoint blockade (ICB) therapy. We next estimated the IC50 of chemotherapeutic drugs for each sample, and found that 37 drugs showed significant differences between the two clusters. The three most relevant drugs were paclitaxel, gemcitabine and camptothecin (Fig. 5C). IC50s of the three drugs in cluster 1 were all significantly lower than those in cluster 2. These predictive results suggest that these chemotherapy drugs will be more effective for patients in cluster 1.

Construction and validation of a drug metabolism-related gene prognostic signature

Finally, LASSO analysis screened six optimal genes (ABCB6, SLCO2A1, CYP2C9, PON1, ABCC5 and ADH4) for establishing a prognostic signature (Fig. 6A). Kaplan–Meier analysis showed that these six genes were significantly related to survival status, with higher expression of ABCB6 and ABCC5 suggesting worse survival (Fig. 6B and 6F), and higher SLCO2A1, CYP2C9, PON1 and ADH4 suggesting better survival (Fig. 6C–6E and 6G). The expression patterns of the six genes in different clinical samples are shown in Fig. 7A; the expression level of ABCB6 and ABCC5 increased with increasing risk score, while the expression level of the other genes decreased with increasing risk score. In addition, higher risk scores included more samples with higher malignancy.

Fig. 6.

Identification of optimal prognostic genes. (A) LASSO algorithm for optimizing the drug metabolism-associated prognostic genes. Left, LASSO coefficient distribution; right, the likelihood bias of LASSO coefficient distribution. In the latter plot, vertical dotted lines represent lambdas.min (left) and lambdas.1se (right). (B–G) Kaplan–Meier analysis for high- and low-expressed ABCB6, SLCO2A1, CYP2C9, PON1, ABCC5 and ADH4.

Fig. 7.

Construction and validation of a risk score model based on the optimal prognostic genes. (A) Expression patterns of the six genes and correlation of risk score with clinical features in the LIHC dataset. (B) Kaplan–Meier analysis, risk score distribution and ROC curves in the LIHC dataset. (C) Expression patterns of the six genes and correlation of risk score with clinical features in the LIRI-JP dataset. (D) Kaplan–Meier analysis, risk score distribution and ROC curves in the LIRI-JP dataset.

A risk score model was constructed based on the six optimal drug metabolism-related prognostic genes. The high-risk group exhibited worse survival than the low-risk group (Fig. 7B, left, P < 0.05). An ROC curve showed that the predicted values of the risk score model at 1, 3 and 5 years were accurate, with AUC values of 0.788, 0.727 and 0.645, respectively (Fig. 7B, right). The expression level of the six genes and the risk score model were validated in the LIRI-JP dataset, and the same results as the outcomes in the LIHC dataset were obtained (Fig. 7C–7D). Specifically, the expression levels of the six genes with increasing risk score were similar to those in the training dataset: ABCB6 and ABCC5 expression increased with increasing risk score; expression of the other four genes decreased with increasing risk score (Fig. 7C). In addition, the high-risk group showed worse survival than the low-risk-group, with AUCs of 0.725, 0.652 and 0.616 at 1, 3 and 5 years, respectively (Fig. 7D).

Finally, we compared the proportion of high- and low-risk samples in the two clusters. The results showed that cluster 1 contained more high-risk samples (74%) (Fig. 8), further indicating the worse prognosis of cluster 1.

Fig. 8.

Sample distribution of high and low risk scores in the two clusters.

DISCUSSION

Liver cancer is a malignancy whose characteristics are highly heterogeneous. The liver is the largest metabolic organ, and metabolic alterations are connected to the development of tumors (Pope et al., 2019). Hypermetabolic liver cancer cells can affect immune cell function and thus facilitate tumor progression (Xu et al., 2022). Here, we identified two prognostic subtypes that were related to drug metabolism, and analyzed the differences in survival, clinical features, immune status and therapy response between the two clusters. Furthermore, a risk score model was established with the optimal prognostic genes and the risk score was demonstrated to be effective in predicting survival. Finally, the subtypes were closely related to the risk score, demonstrating that the subtype categorization in this study showed beneficial predictive ability in relation to survival and drug response. These results should provide indicators for management of patients with liver cancer.

Liver cancer is associated with variation of metabolism, which not only affects cancer development and metastasis but is also related to the immune response and drug resistance, and further influences patient survival (Vaupel et al., 2019; Hu et al., 2020; Siska et al., 2020). We captured a group of ADME genes from a previous report (Hu et al., 2020), and a total of 248 ADME genes were expressed in liver cancer. Among these 248 genes, 239 matched with the dataset used in the present study. Of the 239 metabolism-related genes, 122 were differentially expressed between normal and tumor groups. Further GO and KEGG analysis verified that these genes were closely associated with carcinogenesis and various metabolic processes including drug metabolism, fatty acid metabolism, biosynthesis, antifolate resistance and arachidonic acid metabolism. In the tumor microenvironment, there are many tumor cells and infiltrated immune cells. The glycolytic process of tumor cells may inhibit glucose absorption by immune cells, and thus induce immune escape. Fatty acid metabolism can produce a lot of energy required for tumor cell growth and metastasis, and is associated with tumor progression and drug responses (Hoy et al., 2021). Arachidonic acid is a fatty acid, and has been demonstrated to inhibit the immune response in the tumor environment (McCarty and DiNicolantonio, 2018). Drugs targeting the arachidonic acid pathway have been explored as therapeutic agents in several cancers (Yarla et al., 2016). This evidence collectively shows that our identified genes are closely related to tumor growth and drug metabolism.

Based on these metabolism genes, univariate Cox analysis was performed and identified 54 prognostic genes in liver cancer. These genes were then utilized to group all patients into cluster 1 and cluster 2. In our study, cluster 1 possessed worse survival than cluster 2. Clinical analysis also demonstrated that cluster 1 showed a higher level of malignancy. This may stem from differences in immune cell infiltration level and immune score between the two clusters. Furthermore, the TIDE score in cluster 1 was higher than in cluster 2. A higher TIDE score represents higher potential of tumor immune evasion and less likelihood of benefiting from ICB therapy (Jiang et al., 2018). The results of our study further explained the worse survival of patients in cluster 1. IC50s of 37 drugs were significantly different between the two clusters. Paclitaxel, gemcitabine and camptothecin were most significant. Paclitaxel has been demonstrated to be an active anticancer agent (Panday et al., 1997). In liver tissue, paclitaxel can increase the innate immunity function through activation of the NLRP3 inflammasome in macrophages (Zeng et al., 2019). Paclitaxel is a therapeutic drug against various types of cancer including pancreatic cancer and liver cancer (Goldstein et al., 2015; Wu et al., 2018; Reck et al., 2019). Gemcitabine is commonly used for treatment of solid tumors. Gemcitabine plus oxaliplatin is an effective and safe treatment in patients with advanced hepatocellular carcinoma (Zaanan et al., 2013). Furthermore, gemcitabine inhibits the cell viability and proliferation of hepatocellular carcinoma cells through the Warburg effect (Guo et al., 2019). Camptothecin is a natural alkaloid isolated from Camptotheca acuminata, and is an effective antitumor agent for various cancers such as colon and gastric cancers (Wall and Wani, 1996), and reportedly exhibits a strong inhibitory function on the NRF2–ARE pathway in mice. By lowering NRF2 activity, camptothecin increases the cytotoxicity of chemotherapeutic drugs in cancer cells (Chen et al., 2019). These findings demonstrate that the patients in cluster 1 may be more likely to benefit from chemotherapy. Furthermore, we also found that several genes used to shape the subtype may be related to liver cancer. For example, PPARG is related to metabolism, interacting with oncogenes and promoting tumor progression through activating the WNT/CTNNB1 signaling pathway (To et al., 2021). In addition, MAT1A is expressed in normal liver, and silent in hepatoma tissue. It is reported that overexpression of MAT1A in hepatoma carcinoma cells inhibits tumor growth (Li et al., 2010). The suppression of MAT1A is related to its methylation in tumor tissue (Torres et al., 2000). MAT1A methylation is associated with hepatitis virus infections (Bing et al., 2014). These observations support our proposal that the subtypes established in this study are related to liver cancer progression.

Furthermore, based on the 54 prognostic genes, six genes were finally screened from LASSO analysis. The six genes were ABCB6, SLCO2A1, CYP2C9, PON1, ABCC5 and ADH4. These optimal genes were associated with drug metabolism. ABCB6, a member of the lysosomal ATP-binding cassette transporters, is highly expressed in tumor tissue and its high expression correlates with a poor prognosis (Polireddy et al., 2011; Tsunedomi et al., 2013). Moreover, ABCB6 has been linked with resistance to various drugs in various cancers (Kelter et al., 2007; Hlavata et al., 2012; Kong et al., 2022). SLCO2A1, a prostaglandin transporter, regulates the invasion and apoptosis of lung cancer cells by mediating the PI3K/AKT/mTOR pathway (Zhu et al., 2015). Highly expressed SLCO2A1 can induce cancer cell invasion, and knockdown of SLCO2A1 inhibits the invasion and induces apoptosis. Blocking SLCO2A1 can reduce the tumorigenesis of colon cancer (Nakanishi et al., 2017). More importantly, SLCO2A1 can be potently and selectively inhibited by suramin, which is approved by the FDA (Kamo et al., 2017). CYP2C9, which is involved in the metabolism of cancer drugs and exogenous carcinogens, is downregulated in esophageal squamous cell carcinoma, and can inhibit the invasion and migration of tumor cells via downregulation of HDAC (Jiang et al., 2021). PON1 is a member of the paraoxonase gene family. PON1 is a lipolactonase associated with the elimination of carcinogenic free radicals and maintenance of oxidative balance, and decreased PON1 activity has been found in various cancers (Arenas et al., 2018). ABCC5, known as multidrug resistance protein 5, is negatively correlated with gemcitabine sensitivity in non-small cell lung cancer (Oguri et al., 2006). ADH4 can induce metabolism of a number of substrates including ethanol and retinol. Low expression of ADH4 indicates poor overall survival and has been suggested to be a prognostic biomarker in hepatocellular carcinoma (Wei et al., 2012). Risk scores based on these genes showed adequate predictive ability in relation to survival, and this was validated in the LIRI-JP dataset. The high-risk group showed worse survival, and cluster 1 contained more high-risk samples, further explaining the worse survival of patients in the cluster 1 subtype. These results indicate that the subtypes based on drug metabolism in our study showed effective predictive value for survival and therapy options.

In this study, we found a subtype related to drug metabolism that was effective in predicting survival and drug responses in liver cancer, and established a risk score model to better predict the survival of patients and validate the subtype. These results are useful for drug options in patients with liver cancer. Nevertheless, two limitations are apparent: first, the drug response of the subtype is predicted based on an online database, and validation is desirable in a larger cohort. Second, although the risk score model established in our study has been validated in an external validation dataset, it will benefit from actual clinical validation.

CONCLUSION

The present study finds a drug metabolism-related prognostic subtype that can predict survival and drug responses in liver cancer. Cluster 1 shows worse survival than cluster 2, and cluster 1 patients may derive more benefit from immunotherapy but less benefit from chemotherapy. These results indicate that immunotherapy should be more suitable for cluster 1-type patients, while chemotherapy should be more suitable for cluster 2 patients. Furthermore, the risk score based on the optimal prognostic genes is demonstrated to possess an adequate prognostic ability. The high-risk group showed worse survival; meanwhile, cluster 1 contains the majority (74%) of high-risk samples. These results should be useful for prognosis and specific therapy for patients with liver cancer.

MATERIALS AND METHODS

Data collection

LIHC data with clinical and survival information were downloaded from the UCSC Xena platform (https://toil.xenahubs.net) (Goldman et al., 2019). Samples numbered -11A were taken as paracancer normal tissue samples, and samples numbered -01A were taken as cancer tissue samples. A total of 419 samples (50 paracancer tissue samples and 369 cancer tissue samples) were included, among which 363 cancer tissue samples had prognostic information.

RNA-seq data and corresponding clinical information in the LIRI-JP dataset were collected from the ICGC database (https://dcc.icgc.org/projects/LIRI-JP) (Jennings and Hudson, 2016). A total of 243 cancer tissue samples in 232 patients were included. These data were used to validate subsequent prognostic models.

Identification of drug metabolism genes related to liver cancer

In total, 248 ADME genes that are specifically expressed in liver cancer were obtained from a previous report (Hu et al., 2020). A total of 239 genes (Supplementary Table S1) related to drug metabolism were screened by matching with the genes in the above dataset. The Limma package (Smyth, 2005) (version 3.10.3) was used to conduct a differential expression analysis between the cancer and normal groups. adj.P.Value < 0.05 and |logFC| > 0.5 were set as the thresholds.

Analyses for the drug metabolism genes

The clusterProfiler (version 3.8.1) (Yu et al., 2012) R package was used to conduct GO and KEGG pathway analyses and P < 0.05 was designated as the threshold for significant correlation. The GO analysis contained biological process (BP), cellular component (CC) and molecular function (MF) terms. STRING (version 11.0) (Szklarczyk et al., 2017) was used to conduct a protein–protein interaction (PPI) network analysis. Cytoscape (version 3.6.1) (Shannon et al., 2003) was used to construct a PPT network.

Identification of prognosis-related drug metabolism genes

For samples in LIHC datasets, univariate Cox analysis in the survival package (version 2.41-1) (Wang et al., 2016) was used to screen prognosis-related genes from the above differentially expressed metabolism genes. Genes with P < 0.05 were selected as significant prognostic genes related to drug metabolism.

Identification of drug metabolism subtypes

Based on the prognostic genes, ConsensusClusterPlus (Wilkerson and Hayes, 2010) was used to divide the patients in LIHC into different subtypes. All the tumor samples were divided into two subtypes. Kaplan–Meier analysis in the survival package was further conducted to compare survival between the different subtypes. At the same time, the clinical information was compared between the identified subtypes, and the Chi-square test was used to determine statistical significance.

Immune microenvironment analysis between subtypes

To further investigate variation in immune status among the different subtypes, CIBERSORT (Chen et al., 2018; Kawada et al., 2021) was used to calculate the proportions of each of 22 immune cell types. Differences between the subtypes were then compared using the Wilcoxon test. Furthermore, the immune score and stromal score were calculated using the ESTIMATE (Yoshihara et al., 2013) algorithm, and different immune and stromal scores between subtypes were compared using the Wilcoxon test.

Pathway analysis between subtypes

The GSVA algorithm and R package GSVA (version 1.36.2) (Hänzelmann et al., 2013) were adopted to calculate the enrichment score of each pathway in the molecular signatures database (MSigDB) (Liberzon et al., 2011). The Limma package was used to analyze differences in the pathways between subtypes. Significantly enriched pathways were defined as having P < 0.05.

Immunotherapy and chemotherapy analysis between subtypes

To further investigate the differences between the subtypes, the TIDE score (Jiang et al., 2018) was adopted to evaluate the efficacy of immunotherapy. A high TIDE score suggests poor efficacy of ICB therapy and short survival after ICB therapy. To estimate the sensitivity of each subtype to common chemotherapeutic drugs, the pRRpphetic algorithm (Geeleher et al., 2014) was employed to predict the IC50 of chemotherapeutic drugs. Wilcoxon tests were conducted to calculate P values of TIDE and IC50 between subtypes.

Construction and validation of the drug metabolism-related prognostic model

Twenty-fold LASSO Cox analyses in the glmnet package were used to screen the optimal prognostic genes related to drug metabolism. The identified optimal genes were used to construct the risk score model:   

Risk score= β gene×Exp gene
where β gene represents the regression coefficient of LASSO, and Exp gene represents the expression level of the relevant gene in the LIHC dataset.

The samples in the LIHC dataset were grouped into high- and low-risk groups by the median of risk score. Kaplan–Meier analysis was then used to compare survival between the two groups. For further validating the accuracy of the risk score model, the same regression coefficient was used to calculate the risk score value of each sample in the LIRI-JP dataset.

Finally, the proportion of high- and low-risk samples in the two clusters was determined.

DECLARATIONS

Funding: None.

Conflicts of interest: The authors declare that they have no conflicts of interest.

Authors’ contributions: Conception and design of the research: Y. L., Y. Z.; acquisition of data: Y. Z., J. C., C. H.; analysis and interpretation of data: Y. L., Y. Z., J. C., C. H.; statistical analysis: Y. Z., J. C., X. H.; drafting the manuscript: Y. Z., J. C., C. H.; revision of manuscript for important intellectual content: Y. L., X. H. All authors read and approved the final manuscript.

REFERENCES
 
© 2023 The Author(s).

This is an open access article distributed under the terms of the Creative Commons BY 4.0 International (Attribution) License (https://creativecommons.org/licenses/by/4.0/legalcode), which permits the unrestricted distribution, reproduction and use of the article provided the original source and authors are credited.
https://creativecommons.org/licenses/by/4.0/legalcode
feedback
Top