2023 Volume 41 Issue 1 Pages 1-9
Proper selection of the internal control gene is important for ensuring the accuracy of expression data obtained by quantitative real-time PCR. The expression of the internal control gene must remain constant under different experimental conditions. Typical validation methods include comparing the copy number per quantity of RNA or cDNA and normalizing multiple candidate internal control genes to each other. Both methods have advantages and disadvantages. Validation of internal control expression is essential for ensuring accurate experimental results. Even for frequently used internal controls such as β-actin and GAPDH, validation is necessary to ensure that no fluctuations in expression occur under different experimental conditions. In this report, we present concrete examples of the risks involved in normalizing candidate internal control genes to each other and show how to complement this method. Internal control genes should be selected by establishing multiple combinations of candidate genes and comparing the copy number per quantity of RNA or cDNA.
Quantitative real-time PCR (qRT-PCR) is widely used to analyze gene transcript levels. In the pharmacy field, qRT-PCR is used to compare gene expression among various samples. Normalization of target gene expression to the expression of an internal control is an essential step in qRT-PCR. The accuracy of the results depends on the selection of an appropriate internal control. Internal controls should exhibit abundant, uniform expression under a variety of experimental conditions1). Housekeeping genes, such as β-actin (ACTB), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), and 18S ribosomal RNA (18S rRNA), were used as internal controls for agarose gel electrophoresis, which used to be the mainstream method for semi-quantitative analysis of gene expression. Even though qRT-PCR technology is more advanced, and the accuracy of quantitative analysis has improved, ACTB2,3), GAPDH2,3), and 18S rRNA2,4) are still frequently used as internal controls. However, the expression of these internal controls may fluctuate depending on the experimental conditions; in such cases, these genes are not suitable as internal controls. For a gene to be suitable as an internal control, its expression level must remain constant among different samples.
In particular, GAPDH is reported to be unsuitable as an internal control because its expression is relatively variable compared with that of other internal controls5,6). The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines, established to ensure the reproducibility of qRT-PCR, state that the internal control must be carefully selected from among multiple candidate genes with stable expression7,8). A suitable internal control must exhibit stable expression with little variation under the target conditions and in the target samples. To identify a suitable internal control, multiple candidate genes should be considered. To rule out the possibility of fluctuations in expression, the number of qRT-PCR amplification cycles (Ct) or the copy number per quantity of RNA or cDNA should be compared among the candidate genes, and the expression values of multiple candidate genes should be normalized to each other.
Applications such as geNorm9), Normfinder10,11), BestKeeper9,10), and Reffinder11) have been used to evaluate the expression stability of candidate genes. However, different applications may produce different results. Furthermore, some of the applications do not disclose an explicit algorithm, so the information needed to interpret their output is missing. Therefore, in some cases, the internal control gene is selected without using these applications.
In this study, we investigated the risks involved in selecting the internal control gene without using an application. We determined whether the expression of candidate genes selected as internal controls varied between two-dimensional (2D) and three-dimensional (3D) cell cultures and analyzed the risk of selecting an inappropriate internal control.
The human esophageal squamous cell carcinoma cell lines KYSE30 (Lot# JCRB0188, JCRB Cell Bank, Ibaraki, Osaka, Japan), KYSE70 (Lot# 11J032, Health Protection Agency, Salisbury, UK; transported in a frozen state), and KYSE140 (Lot# JCRB1063, JCRB Cell Bank, Ibaraki, Osaka, Japan)12,13,14) were maintained in RPMI 1640 medium (Sigma, Saint Louis, MO, USA) or DMEM (Sigma, Saint Louis, MO, USA) supplemented with 10% fetal bovine serum and 1% Antibiotic-Antimycotic (Life Technologies, Grand Island, NY, USA) at 37 °C in a CO2 incubator for 72 hours (Fig. 1A). To establish the 3D culture, a 100 μL cell suspension (1.0 × 10^4 cells/mL; final number, 1000 cells/well)15) was transferred to a Prime Surface 96 Well V plate (Sumitomo Bakelite, Tokyo, Japan). To establish the 2D culture, a 3 mL cell suspension (1 × 10^5 cells/mL; final number, 3 × 10^5 cells/culture dish) was transferred to a 60 mm cell culture dish (CORNING, Corning, NY, USA).
Esophageal cancer cells and primer positions and sequences.
A: Esophageal cancer cells in 2D and 3D culture, 72 hours. Scale bars indicate 100 μm. B: Primers for all genes except 18S rRNA were designed so that either the forward or reverse primers straddled different exons. The sequences were retrieved using the National Center for Biotechnology Information accession numbers.
We selected ACTB, 18S rRNA, GAPDH, and TATA-binding protein (TBP) as candidate internal controls16,17). The primer positions and sequences are shown in Figure 1. The primers were designed so that the forward (F) primer, the reverse (R) primer, or both straddled an exon junction. The length of the PCR products was set to 100–170 bp.
RNA extraction, cDNA synthesis, and qRT-PCR
Total RNA was extracted from the 2D and 3D cultures using the RNeasy Mini Kit (Qiagen, Dusseldorf, Germany) following the manufacturer’s instructions. cDNA synthesis was performed using a cDNA synthesis kit (Roche, Basel, Switzerland). Using first-strand cDNA (50 ng), qRT-PCR was performed with a 96-well optical plate and the 7500 Real-Time PCR system (Applied Biosystems, Tokyo, Japan). The copy number of the target cDNA fragments was calculated based on the threshold cycle (Ct) value, and standard curves for each gene were created by determining the copy number at different concentrations.
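The conversion from Ct to copy number via a standard curve can be sketched as follows. This is a minimal illustration, assuming the usual linear relation between Ct and log10(copy number). The function names and the dilution-series values are hypothetical and are not data from this study.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def ct_to_copies(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical 10-fold dilution series (illustrative values only).
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimated_copies = ct_to_copies(23.4, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={estimated_copies:.3g}")
```

A separate curve would be fitted for each gene, because the slope and intercept depend on the primer pair.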
Measurement of samples and determination of DNA concentration
A schematic of the sample preparation for the analysis is shown in Figure 2. We prepared five experimentally independent cDNA samples. The cDNA was dispensed from one independent sample into five wells for measurement, and the measurements of the five wells were averaged to obtain the value for a sample. The average of the five independent sample values was then used as the sample value for the condition (e.g., 3D KYSE70 TBP/ACTB). The data shown in Fig. 3, Fig. 4A, and Fig. 4C were obtained using the averaged five sample values. For the statistical analysis shown in Figure 6B, ANOVA was performed on samples 1–5 under the same conditions using the Ct values plotted in Fig. 6A. As shown in Fig. 6C, the sample groups (3D KYSE70 GAPDH, 3D KYSE140 TBP) for which the null hypothesis was rejected as a result of ANOVA were analyzed using Tukey’s multiple comparison test.
Data determination scheme and normalization scheme for qRT-PCR samples.
Five measurements (in five wells, one per well) were performed per sample, and the copy concentrations (Conc.) were calculated based on the obtained Ct values. The five calculated copy concentrations were averaged to obtain the measured value for that sample. Five samples were measured in this manner and normalized. The five normalized values were then averaged to obtain the average value shown in Fig. 3. The normalization scheme is shown. Numer denotes the numerator, and Denom denotes the denominator.
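The averaging and normalization scheme described above can be sketched in a few lines. This is an illustrative reading of the scheme, not the authors' analysis script; the function names and all numbers are hypothetical.

```python
from statistics import mean

def sample_value(well_concs):
    """Average the replicate wells of one independent sample."""
    return mean(well_concs)

def normalized_condition_value(numer_samples, denom_samples):
    """Normalize each sample (numerator gene / denominator gene), then average.

    Each argument is a list of samples; each sample is a list of per-well
    copy concentrations. Returns (per-sample ratios, mean of the ratios).
    """
    ratios = [
        sample_value(numer) / sample_value(denom)
        for numer, denom in zip(numer_samples, denom_samples)
    ]
    return ratios, mean(ratios)

# Hypothetical copy concentrations: five samples, five wells each.
tbp_wells = [[m - 2, m - 1, m, m + 1, m + 2] for m in (10, 20, 30, 40, 50)]
actb_wells = [[98, 99, 100, 101, 102] for _ in range(5)]

ratios, condition_value = normalized_condition_value(tbp_wells, actb_wells)
print(ratios, condition_value)
```

Keeping the per-sample ratios alongside their mean makes sample-to-sample variation visible instead of hiding it in the average.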
Comparison of 2D and 3D normalized data.
The normalized data for KYSE30, KYSE70 and KYSE140 cells are plotted as a bar graph. Y-axis values indicate the average normalized value (n = 5). The open bars and closed bars indicate the data from 2D cells and 3D spheroids, respectively. The asterisk (*) indicates a significant difference (p < 0.05).
Comparison of the fluctuation ratio of GAPDH and TBP expression between 2D cells and 3D spheroids in KYSE30 cells.
A: Comparison of the copy numbers/ng cDNA of GAPDH and TBP in KYSE30 cells. The asterisk (*) indicates a significant difference (p < 0.05). B: Expression variation of GAPDH and TBP in each sample. C: Fluctuation ratio (3D/2D) of GAPDH and TBP expression in KYSE30 cells.
Statistical analysis of sample homogeneity under equivalent conditions.
A: Plot of Ct value/ng cDNA for ACTB, 18S, GAPDH and TBP in KYSE70 and KYSE140. Five measurements were made per sample. B: The results of analysis of variance for samples under equivalent conditions. If the null hypothesis is rejected, the sample group is not homogeneous. C: Results of the multiple comparisons test for GAPDH in KYSE70 cells and TBP in KYSE140 cells.
The average of the five normalized values was calculated and compared between the 2D and 3D cultures (Fig. 3). An asterisk indicates a significant difference between the normalized data for the 2D and 3D cultures. When a significant difference occurred, the expression of either gene or both genes in the normalized combination may have varied between the 2D and 3D cultures. Significant differences in expression were observed for GAPDH/ACTB, TBP/ACTB, GAPDH/18S, and TBP/18S in KYSE30 cells. This result may indicate that GAPDH, ACTB, TBP, and 18S all exhibited variable expression in KYSE30 cells. Conversely, no significant differences were detected for 18S/ACTB and TBP/GAPDH in KYSE30 cells. However, when the copy numbers were compared, the expression of both GAPDH and TBP significantly increased in 3D spheroids (Fig. 4A). In addition, GAPDH and TBP exhibited a trend of increased expression in all 3D spheroid samples (Fig. 4B). When the magnitudes of the changes in each sample were averaged, the expression of both GAPDH and TBP in 3D spheroids exhibited an approximately 2-fold increase (Fig. 4C). Taken alone, the normalized data for TBP/GAPDH would suggest that the expression levels of GAPDH and TBP were equal between the 2D and 3D cultures. In contrast, when copy number was calculated, the expression levels of both GAPDH and TBP in the 3D spheroids were almost twice those in the 2D cell culture (Fig. 4C). Because GAPDH and TBP expression fluctuated by the same ratio between the 2D and 3D cultures, normalizing TBP to GAPDH resulted in approximately equal values (Fig. 3), with no apparent variation. However, TBP and GAPDH are not suitable as internal controls because their expression levels varied between the 2D and 3D cultures.
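The masking effect described above can be made concrete with a small numerical sketch. The copy numbers below are hypothetical and chosen only so that both genes double in 3D culture; they are not measurements from this study.

```python
def fold_change(value_3d, value_2d):
    """Fluctuation ratio (3D/2D) of a single gene."""
    return value_3d / value_2d

# Hypothetical copy numbers per ng cDNA (illustrative only).
copies_2d = {"GAPDH": 2000.0, "TBP": 50.0}
copies_3d = {"GAPDH": 4000.0, "TBP": 100.0}

# Normalizing TBP to GAPDH gives the same value in both cultures...
ratio_2d = copies_2d["TBP"] / copies_2d["GAPDH"]
ratio_3d = copies_3d["TBP"] / copies_3d["GAPDH"]

# ...although each gene on its own changed 2-fold.
fluctuation = {gene: fold_change(copies_3d[gene], copies_2d[gene]) for gene in copies_2d}

print(ratio_2d, ratio_3d)   # identical normalized values
print(fluctuation)          # both genes doubled
```

The normalized ratio is blind to any change shared by the numerator and denominator genes, which is why the copy number per quantity of cDNA has to be checked as well.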
Fluctuations in expression under equivalent conditions
The normalized data for GAPDH/ACTB and GAPDH/18S in KYSE70 cells and for TBP/ACTB and TBP/18S in KYSE140 cells are plotted for each sample (samples 1–5; Fig. 5A). The values shown in Fig. 3 are the averages of the normalized data from five samples. Behind the averaged values for GAPDH/ACTB and GAPDH/18S in KYSE70 cells, more sample-to-sample variability was observed in the 3D culture than in the 2D culture. In KYSE140 cells, TBP/ACTB and TBP/18S exhibited variation in both the 2D and 3D cultures (Fig. 5A). We also compared the fold change in cDNA copy number for each gene between the 3D and 2D cultures in KYSE70 and KYSE140 cells. The ratio of change in TBP expression in KYSE140 cells and in GAPDH expression in KYSE70 cells was found to vary from sample to sample (Fig. 5B).
Plot of normalized data and fluctuation ratio of samples under equivalent conditions.
A: Plot of normalized values for GAPDH/ACTB and GAPDH/18S in KYSE70 cells and TBP/ACTB and TBP/18S in KYSE140 cells. B: Plot of fluctuation ratio (3D/2D) of candidate gene expression in KYSE70 and KYSE140 cells.
The Ct values were determined by averaging five measurements per sample. Figure 6A plots each individual measurement used for Ct determination, that is, the five measurements of ACTB, 18S, GAPDH, and TBP in KYSE70 and KYSE140 cells. For most genes, the values fall within a Ct range of 1 to 2 cycles; however, the values for GAPDH in 3D KYSE70 and TBP in 3D KYSE140 spread over a Ct range of 6 to 7 cycles. ANOVA and a multiple comparison procedure were performed on these data to determine whether the samples for the same gene and culture condition (2D or 3D culture) formed a heterogeneous group.
No significant differences were identified via ANOVA; the null hypothesis, “There is no difference in the average value of each sample”, was not rejected in terms of GAPDH expression in the 2D KYSE70 cell culture or TBP expression in the 2D KYSE140 cell culture (Fig. 6B). Conversely, when the 3D culture data was analyzed, significant differences were observed; the null hypothesis was rejected, and the alternative hypothesis, “Not all averages are equal,” was adopted (Fig. 6B). Tukey’s post hoc test was used to identify the samples that had significant differences (Fig. 6C). These results indicated that even under the same experimental conditions, gene expression differed between samples in some cases. Furthermore, even if the controlled variables are equivalent, uncontrollable factors may vary, causing differences in the observed expression levels.
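The homogeneity check on samples 1–5 can be sketched as a one-way ANOVA on the per-well Ct values. This is a minimal standard-library illustration, not the authors' analysis: the Ct values are hypothetical, and the critical value 2.87 for F(4, 20) at the 0.05 level is taken from a standard F table rather than computed. A post hoc test such as Tukey's would follow only when the null hypothesis is rejected and is not implemented here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Assumption: critical value of F(4, 20) at alpha = 0.05 from a standard F table.
F_CRIT_4_20 = 2.87

well_noise = (0.0, 0.1, -0.1, 0.2, -0.2)

# Hypothetical Ct values: five samples x five wells, all samples centered on 20.
homogeneous = [[20.0 + d for d in well_noise] for _ in range(5)]
# Hypothetical Ct values: sample means spread over about 6 cycles.
heterogeneous = [[m + d for d in well_noise] for m in (20.0, 22.0, 24.0, 26.0, 21.0)]

for label, groups in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    f_stat, df_b, df_w = one_way_anova_f(groups)
    verdict = "rejected" if f_stat > F_CRIT_4_20 else "not rejected"
    print(f"{label}: F({df_b}, {df_w}) = {f_stat:.2f}, null hypothesis {verdict}")
```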
In qPCR analysis, the internal control is an important factor that influences the results. The internal control must not exhibit any variation in expression under different experimental conditions18). Differences in expression, or unstable expression, greatly affect experimental results. As the unstable expression shown in Figures 5 and 6 illustrates, expression levels may vary among samples under the same experimental conditions.
In this case, there is a risk that the null hypothesis is not rejected in the t-test, which would result in a finding of no significant difference. Although this “no significant difference” is not a guarantee of “no change in expression”, the results of statistical analysis may lead to a wrong determination of “no change in expression”.
We identified two cases in which variation exists that may disqualify a gene as an internal control, even if the variation does not qualify as a statistically significant difference. A gene is unsuitable as an internal control (1) if its expression and that of the gene it is normalized to vary by the same ratio under different experimental conditions, so that the normalized value does not change (Fig. 4), or (2) if its expression exhibits significant variation between samples under the same experimental conditions (Fig. 5 and 6). Variation in candidate internal control expression between samples has been reported previously19,20,21).
When comparing two experimental conditions (2D and 3D cultures) by t-test (Fig. 3), the null hypothesis is, “There is no difference in the expression level between the 2D and 3D cultures”, and the alternative hypothesis is, “There is a difference in the expression level between the 2D and 3D cultures”. If the null hypothesis is rejected, the difference is considered significant. Conversely, if the p-value is greater than the significance level, the null hypothesis is not rejected. However, this does not mean that the null hypothesis is true. Concluding that there is no difference in expression because there is no significant difference in the normalized values is experimentally incorrect because of the two cases mentioned earlier, and it is also an incorrect interpretation of the statistical analysis. Thus, when multiple samples are analyzed, the suitability of a gene as an internal control cannot be accurately determined based only on the normalized values, average values, and presence or absence of significant differences. Furthermore, it is inappropriate to use an internal control gene reported in the literature for novel experiments without validation. Ayakannu et al. warned against using internal control genes without validation16). Three cell lines derived from human esophageal cancer12,13,14), KYSE30, KYSE70, and KYSE140, were used in the present study. Since the expression patterns of candidate internal control genes differed among the cell lines, it is necessary to validate candidate genes in each cell line. As shown in this report, even within the same cell line, the expression of candidate genes may lose stability because of differences between 2D and 3D culture conditions. We suggest that the following points be considered when determining control genes in quantitative RT-PCR.
1. Prepare multiple primer sets for each candidate gene when selecting a control gene.
2. Prepare several independent samples under the same conditions and measure the candidate gene in each of them.
3. Determine the value for a single sample from multiple measurements rather than from a single measurement.
Confirming through these experiments that the measured values show no significant variation is essential for the selection of control genes. In each independent experiment, we must consider the risk that the model organism or cells are exposed to environmental differences that we cannot perceive, even if the conditions recorded in the experimental notes are identical. Hence, the robustness of gene expression should be confirmed by preparing several independent samples for the determination of control genes.
Over 90% of the qRT-PCR analyses published in high-impact journals used only one internal control22). Schmid et al. recommend the use of two or more internal controls23). Internal controls should be carefully selected from several combinations of candidate genes, based on normalized values and copy numbers per quantity of cDNA. The use of two or more internal control genes increases the reliability of an experiment.
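The text does not specify how two or more internal controls should be combined. One widely used option, adopted for example by geNorm, is to normalize the target to the geometric mean of the reference genes. The sketch below assumes that approach; the function name and all copy numbers are hypothetical.

```python
import math

def normalization_factor(reference_copies):
    """Geometric mean of the copy numbers of two or more reference genes."""
    logs = [math.log(c) for c in reference_copies]
    return math.exp(sum(logs) / len(logs))

# Hypothetical copy numbers per ng cDNA for one sample (illustrative only).
reference_genes = {"ACTB": 4000.0, "18S": 1000.0}
target_copies = 500.0

factor = normalization_factor(reference_genes.values())
normalized_target = target_copies / factor
print(f"normalization factor={factor:.1f}, normalized target={normalized_target:.3f}")
```

The geometric mean is less dominated by a highly abundant reference gene than the arithmetic mean, but it does not remove the need to validate each reference gene individually, since a shared shift in all references would still go unnoticed.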
This research was supported by a grant from the Keiryokai Research Foundation (No. 9245006).
The author declares no competing financial interest.