2023 Volume 30 Issue 5 Pages 467-480
Aims: Genetic and medical insights from studies on cardioprotective phenotypes aid the development of novel therapeutics. This study identified genetic variants associated with supernormal coronary arteries using genome-wide association study data and the corresponding genes based on expression quantitative trait loci (eQTL).
Methods: Study participants were selected from two Korean cohorts according to inclusion criteria that included males with high cardiovascular risk (Framingham risk score ≥ 14, 10-year risk ≥ 16%) but with normal coronary arteries (supernormal group) or coronary artery disease (control group). After screening 12,309 individuals, males meeting the supernormal phenotype (n=72) and age-matched controls (n=94) were enrolled. Genetic variants associated with the supernormal phenotype were identified using Firth’s logistic regression, and eQTL was used to evaluate whether the identified variants influence the expression of particular genes in human tissues.
Results: Approximately 5 million autosomal variants were tested for association with the supernormal phenotype, and 10 independent loci suggestive of supernormal coronary arteries (p<5.0×10−5) were identified. The lead variants were seven intergenic single-nucleotide polymorphisms (SNPs), including one near PBX1, and three intronic SNPs, including one in PPFIA4. Of these variants or their proxies, rs9630089, rs6427989, and rs4984694 were associated with expression levels of SLIT1 and ARHGAP19, PPFIA4, and METTL26 in human tissues, respectively. These eQTL results supported their potential biological relevance.
Conclusions: This study identified genetic variants and eQTL genes associated with supernormal coronary arteries. These results suggest candidate genes representing potential therapeutic targets for coronary artery disease.
See editorial vol. 30: 434-436
Genetic factors substantially affect the development of atherosclerotic cardiovascular disease1). In the previous decade, numerous studies actively investigated genetic variants contributing to this risk, finding that many of them are related to traditional clinical factors, whereas others are linked to noncoding genetic regions and uncertain pathways2-5). Additionally, studies focused on individuals protected from vascular disease identified loss-of-function variants in ANGPTL3 by family analysis6) or in PCSK9 7); these studies were performed using people nonspecifically registered and analyzed with regard to the presence of vascular disease. Genes with variants associated with a protective phenotype usually belong to pathways related to lipid metabolism. Genetic and medical insights obtained from these studies were fundamental to the successful development of targeted therapeutics, such as ANGPTL3 or PCSK9 monoclonal antibodies against atherosclerotic vascular disease8).
In a recent study, supernormal individuals with respect to arterial stiffness showed better clinical outcomes, including those related to cardiovascular events, than the control group9). Additionally, biological pathways related to metabolic homeostasis are reportedly associated with resistance to obesity and vascular disease10). Although individuals’ factors affecting these phenotypes may vary, these factors have not been reported in detail. Researchers speculate that genetic factors are partly involved in disease resistance in these individuals11), and a recent study involving such protective phenotypes was conducted in patients with diabetes resistant to diabetic retinopathy12). However, no genetic studies on a protective vascular phenotype have been undertaken using individuals with high cardiovascular risk scores based on a combination of risk factors.
In this study, we aimed to identify genetic variants associated with supernormal coronary arteries using genome-wide association study (GWAS) data and the corresponding genes using expression quantitative trait loci (eQTL). The supernormal phenotype was defined as normal coronary arteries in individuals with high cardiovascular risk, whereas the control phenotype was the presence of coronary artery disease in those with the same risk.
This study was performed in accordance with the Declaration of Helsinki and received proper ethical oversight. The Institutional Review Board of a university hospital, Seoul, Korea, approved the research protocol (4-2016-0979). We selected male participants from two cohorts: the Cardiovascular Genome Center (CGC) cohort and the Cardiovascular and Metabolic Disease Etiology Research Center-High Risk (CMERC-HI) cohort at a university, Seoul, Korea. Because most of the study participants presenting a high calculated cardiovascular risk were male, females were excluded. Written informed consent was obtained from all participants. All patients who visited the Division of Cardiology at a university hospital for a health check-up or due to chest symptoms from January 2001 to August 2009 were included in the CGC cohort. The CMERC-HI cohort enrolled patients with high cardiovascular risk at a university hospital, as previously described13). The inclusion criteria for the CMERC-HI cohort were patients with hypertension and reduced estimated glomerular filtration rate, patients with diabetes and albuminuria, and individuals with carotid plaque or increased carotid intima-media thickness. In the two cohorts, individuals underwent coronary calcium scoring for the health check-up or coronary angiography, as needed, for their chest symptoms.
Inclusion criteria for the present study were males with high cardiovascular risk but normal coronary arteries (supernormal group) or with coronary artery disease (control group). Participants for the supernormal group were selected from the two cohorts, and those for the control group were selected by age matching with a 1:1.3 ratio from the CGC cohort. Cardiovascular risk was calculated using the Framingham risk score (https://www.framinghamheartstudy.org/fhs-risk-functions/hard-coronary-heart-disease-10-year-risk/)14), and subjects with scores ≥ 14 (and thus 10-year risk ≥ 16%) were considered to have high risk and enrolled in the study. This risk calculator defines cardiovascular events as hard outcomes, including only myocardial infarction or coronary death. We defined normal coronary arteries as those with no visible plaques in coronary angiography or with a coronary calcium score of 0 in computed tomography. Coronary artery disease was defined as luminal stenosis ≥ 50% in one or more epicardial coronary arteries.
Among 3470 individuals in the CMERC-HI cohort, 1569 males underwent coronary calcium scoring; 65 individuals with a study-feasible Framingham score but showing a score of 0 for coronary calcium were enrolled as supernormal. Among 8839 individuals in the CGC cohort, 4425 males received coronary angiography; 7 individuals showing a study-feasible Framingham score but normal coronary arteries were enrolled as supernormal. Additionally, 94 male participants with a study-feasible Framingham score and coronary artery disease were age-matched from the CGC cohort and enrolled as controls. Enrollment flow is described in Supplemental Fig.1.
Enrollment of the study population
DNA samples of study participants were genotyped on the Korea Biobank Array (Theragen, Seoul, Korea) optimized for the Korean population15). Sample- and variant-level QC of the genotyped data was performed, and samples that included related individuals (second-degree or closer relationships) were identified using KING (v. 2.5)16) and excluded. Samples were excluded based on multiple criteria, including call rate <95%, a heterozygosity rate five standard deviations (SD) away from the mean, and discordance between reported sex and inferred sex based on the heterozygosity rate on chromosome X. After excluding low-quality samples, variants with a call rate <98%, minor allele frequency (MAF) <1%, or showing deviation from Hardy–Weinberg equilibrium (p<1.0×10−6) were excluded. Principal component analysis (PCA) with common (MAF ≥ 5%) genetic variants identified no outlier samples (the first three components of all samples were within five SD of the mean). After QC, the genotype data was phased using Eagle (v.2.3)17) and imputed on the Haplotype Reference Consortium (r.1.1 2016) reference panel using Minimac (v.4.0)18). Genetic variants with a low imputation quality score (r2<0.8) or MAF <1% were excluded for reducing false-positive imputation results.
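The variant-level filters described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the chi-squared approximation for the Hardy–Weinberg test is an assumption made for brevity; the study does not state which test implementation was used, and dedicated genetics software commonly applies an exact test instead.

```python
import math


def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-squared test for Hardy-Weinberg equilibrium.

    This is an approximation chosen for illustration, not the study's method.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-squared variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_variant_qc(n_aa, n_ab, n_bb, n_missing,
                      min_call_rate=0.98, min_maf=0.01, min_hwe_p=1.0e-6):
    """Apply the call-rate, MAF, and HWE thresholds quoted in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chisq_p(n_aa, n_ab, n_bb) >= min_hwe_p
```

The default thresholds mirror the values reported in the text (call rate 98%, MAF 1%, Hardy–Weinberg p of 1.0×10−6); the genotype counts passed in are hypothetical inputs.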
Genome-Wide Association Analysis
The associations of genetic variants with the supernormal trait were tested using Firth’s logistic regression to reduce false-positive associations of low-frequency variants due to the small allele count. Age and the first four principal components of genetic ancestry were adjusted as covariates in the regression model. To identify independent variants, linkage disequilibrium clumping (r2<0.1) was conducted with GWAS results. To limit false-negative findings given the small sample size, variants reaching an association p<5.0×10−5 were considered significant. For the top variants of each significant locus, the permutation test for Firth’s regression coefficients using the Monte Carlo method was performed to verify GWAS results. Permutation testing evaluated how often the observed significance would arise by chance if these analyses were repeated and if there were no true-positive findings19). The nearest genes and functional consequences of the top variants were annotated using ANNOVAR20).
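As an illustration of why Firth’s penalized regression suits small allele counts, the following is a minimal pure-Python sketch of the estimator (Newton iterations on the Jeffreys-prior-penalized score). It is not the software used in the study; the function names `solve` and `firth_logistic` are hypothetical, and the sketch omits step-halving, standard errors, and the covariates (age and principal components) that the actual model included.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def firth_logistic(X, y, max_iter=50, tol=1e-10):
    """Return Firth-penalized logistic regression coefficients.

    X: list of rows (include a leading 1.0 for the intercept); y: 0/1 labels.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        pi = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row))))
              for row in X]
        w = [q * (1.0 - q) for q in pi]
        # Fisher information X^T W X.
        info = [[sum(w[i] * X[i][a] * X[i][b] for i in range(n))
                 for b in range(p)] for a in range(p)]
        # Leverages h_i = w_i * x_i^T (X^T W X)^{-1} x_i.
        h = []
        for i in range(n):
            z = solve(info, X[i])
            h.append(w[i] * sum(X[i][a] * z[a] for a in range(p)))
        # Firth-modified score: adds h_i * (0.5 - pi_i) to each residual.
        score = [sum((y[i] - pi[i] + h[i] * (0.5 - pi[i])) * X[i][a]
                     for i in range(n)) for a in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta
```

For a single binary predictor with an intercept, the penalized estimate coincides with the log odds ratio obtained after adding 0.5 to each cell of the 2×2 table, which gives a convenient check: with 1 of 5 cases among non-carriers and 4 of 5 among carriers (hypothetical counts), the slope should equal log 9.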
eQTL Analysis using the Genotype-Tissue Expression (GTEx) Project Data
eQTL data from the GTEx project21) were used to identify variants associated with gene expression among GWAS lead variants or their proxies (LD r2>0.8). For GWAS lead variants or their proxies, a total of 265 genes were tested for cis-eQTL associations in relevant tissues (coronary artery, whole blood, and subcutaneous adipose and visceral adipose tissues). Cis-eQTL results that passed Bonferroni’s correction were considered significant (cis-eQTL p<0.05/265).
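The Bonferroni threshold stated above works out to roughly 1.9×10−4. The short Python sketch below computes it and applies it to the eQTL p values reported in Supplemental Table 2 and to the nominal RGS4 association mentioned in the Results; the variable names are illustrative only.

```python
ALPHA = 0.05
N_GENES_TESTED = 265

# Bonferroni-corrected significance threshold for cis-eQTL tests.
threshold = ALPHA / N_GENES_TESTED

# (variant, gene, tissue) -> eQTL p value, as reported in this article.
eqtl_p = {
    ("rs6427989", "PPFIA4", "Whole Blood"): 2.3e-6,
    ("rs9630089", "SLIT1", "Artery - Coronary"): 3.3e-10,
    ("rs9630089", "SLIT1", "Adipose - Subcutaneous"): 7.1e-9,
    ("rs9630089", "SLIT1", "Adipose - Visceral (Omentum)"): 6.8e-6,
    ("rs9630089", "ARHGAP19", "Adipose - Subcutaneous"): 7.9e-6,
    ("rs4984694", "METTL26", "Whole Blood"): 8.5e-5,
    ("rs10799925", "RGS4", "Artery - Coronary"): 0.032,
}

# True where the association passes the corrected threshold.
significant = {key: p < threshold for key, p in eqtl_p.items()}
```

Under this threshold, the six associations listed in Supplemental Table 2 pass, whereas the RGS4 association (p=0.032) is only nominally significant.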
The median age of the study population was 67 years, 39% of the participants had diabetes, and >50% of the population were hypertensive or current smokers. Median levels of total cholesterol and high-density lipoprotein-cholesterol were 174 and 38 mg/dL, respectively, and the median Framingham risk score was 15. The supernormal group showed higher frequencies of hypertension and smoking, lower eGFR, and higher Framingham risk scores than the control group (Table 1), whereas other parameters were similar between the two groups.
Characteristic | Total (n=166) | Supernormal (n=72) | Control (n=94) | p |
---|---|---|---|---|
Age, y | 67 (60–72) | 68 (61–73) | 65 (60–71) | 0.41 |
Diabetes mellitus | 64 (38.6) | 29 (40.3) | 35 (37.2) | 0.69 |
Hypertension | 125 (75.3) | 66 (91.7) | 59 (62.8) | <0.001 |
Smoking | 101 (60.8) | 56 (77.8) | 45 (47.9) | <0.001 |
BMI, kg/m2 | 24.9 (23.1–26.8) | 25.4 (23.3–26.7) | 24.6 (23.0–26.9) | 0.52 |
Systolic blood pressure, mmHg | 132 (120–141) | 133 (124–142) | 130 (120–140) | 0.85 |
eGFR, mL/min/1.73 m2 | 71 (57–86) | 63 (36–86) | 75 (63–86) | 0.014 |
Total cholesterol, mg/dL | 174 (156–204) | 175 (156–194) | 173 (159–212) | 0.41 |
Triglycerides, mg/dL | 132 (93–195) | 119 (92–200) | 139 (96–192) | 0.94 |
HDL-C, mg/dL | 38 (34–44) | 39 (34–44) | 38 (34–44) | 0.53 |
Framingham risk score | 15 (14–16) | 15 (15–16) | 15 (14–15) | <0.001 |
BMI, body mass index; eGFR, estimated glomerular filtration rate; HDL-C, high-density lipoprotein-cholesterol.
Values are presented as medians (interquartile range) for non-normal continuous data and numbers (%) for binary data; p values represent the results of the Wilcoxon rank-sum test for continuous data and Pearson’s chi-squared test for categorical data.
For the GWAS (Fig.1A), after imputing genotype data using the Haplotype Reference Consortium reference panel, we tested 5,197,138 autosomal variants using age and the first four PCs of genetic ancestry as covariates. PCA showed no evidence of population stratification despite the subject selection from two cohorts (Supplemental Fig.2). Despite the lack of statistical power according to a genomic inflation factor (λGC) of 0.892, we identified 10 independent loci with suggestive associations (p<5.0×10−5) with the supernormal group (Table 2 and Supplemental Table 1; Supplemental Figs.3 and 4). The observed significance levels of 10 lead single-nucleotide polymorphisms (SNPs) in each locus were verified under the empirical null distribution by permutation test using the Monte Carlo method (p<9.0×10−6). In each cohort of the supernormal group, effective allele frequencies of the lead SNPs were larger than those of the control group. There was no considerable difference between the two supernormal cohorts (Supplemental Fig.5). Seven loci were intergenic, including a SNP near PBX1, and three were intronic, including SNPs in PPFIA4 and ARHGAP19-SLIT1. Genotypic patterns of the 10 lead variants in the study participants are presented in Fig.1B.
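The logic of the Monte Carlo permutation check can be sketched in a few lines of Python. This is a simplified stand-in: the study permuted Firth’s regression coefficients, whereas the sketch below uses the absolute difference in mean allele dosage between groups as the test statistic. The function name, default number of permutations, and seed are hypothetical.

```python
import random


def permutation_p(genotypes, labels, n_perm=2000, seed=0):
    """Empirical p value from shuffling case/control labels.

    genotypes: allele dosages (0, 1, or 2); labels: 1 = supernormal, 0 = control.
    """
    rng = random.Random(seed)

    def stat(lab):
        case = [g for g, l in zip(genotypes, lab) if l == 1]
        ctrl = [g for g, l in zip(genotypes, lab) if l == 0]
        return abs(sum(case) / len(case) - sum(ctrl) / len(ctrl))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations at least as extreme as the observed statistic.
        if stat(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the smallest attainable value is 1/(n_perm+1), resolving p values on the order of 10−6, as reported for the lead SNPs, would require at least about one million permutations per variant.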
(A) A Manhattan plot [−log10(p)] for supernormal coronary arteries is shown. The red horizontal line corresponds to the threshold for suggestive significance (p=5×10−5), and blue dots denote significant loci according to GWAS findings. The nearest genes to the top variants are shown over each locus. Genes showing significant expression quantitative trait loci associations are in green. (B) Genotypes of lead single-nucleotide polymorphisms (y axis) in study participants (x axis) are shown. The color of each tile represents the number of affected alleles related to supernormal coronary arteries.
The first three PCs of genetic ancestry are presented as the mean±3 SD (blue dashed lines) and the mean±5 SD of each PC (red lines). Each dot represents a participant in the study. Red circles, green X dots, and blue X dots represent the control from CGC cohort, supernormal from CGC cohort, and supernormal from CMERC-HI cohort, respectively.
Abbreviations: CGC, Cardiovascular Genome Center; CMERC-HI, Cardiovascular and Metabolic Disease Etiology Research Center - High Risk
rsID | GRCh37 | Near genes or eQTL genes | Function | A1 | A2 | OR | L95 | U95 | p | p permutation | EAF (supernormal) | EAF (control) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
rs10799925 | 1:163480019 | LOC100422212, PBX1 | Intergenic | T | G | 3.185 | 1.866 | 5.435 | 2.22×10−5 | 3.8×10−6 | 0.674 | 0.558 |
rs6427989 | 1:203019737 | PPFIA4* | Intronic | A | G | 4.032 | 2.169 | 7.463 | 1.02×10−5 | 1.0×10−6 | 0.833 | 0.612 |
rs9710701 | 2:101344222 | LINC01849, NPAS2 | Intergenic | C | T | 4.505 | 2.217 | 9.174 | 3.14×10−5 | 2.4×10−6 | 0.917 | 0.718 |
rs1415914 | 6:141908348 | MIR4465, NMBR | Intergenic | T | A | 2.843 | 1.726 | 4.686 | 4.12×10−5 | 8.6×10−6 | 0.424 | 0.340 |
rs1538950 | 10:31042464 | LINC02644, ZNF438 | Intergenic | T | C | 2.927 | 1.752 | 4.887 | 4.06×10−5 | 9.0×10−6 | 0.479 | 0.250 |
rs9630089 | 10:98968967 | SLIT1*, ARHGAP19* | Intronic (ncRNA) | G | A | 3.158 | 1.817 | 5.489 | 4.55×10−5 | 8.2×10−6 | 0.486 | 0.261 |
rs1847474 | 11:38796040 | LINC01493, LRRC4C | Intergenic | C | T | 3.179 | 1.846 | 5.473 | 3.02×10−5 | 5.6×10−6 | 0.410 | 0.181 |
rs3002220 | 13:25685942 | PABPC3, AMER2 | Intergenic | C | T | 3.040 | 1.818 | 5.076 | 2.19×10−5 | 3.0×10−6 | 0.750 | 0.505 |
rs12587912 | 14:37186791 | SLC25A21 | Intronic | A | G | 3.070 | 1.817 | 5.189 | 2.79×10−5 | 4.6×10−6 | 0.410 | 0.176 |
rs663580 | 16:890470 | PRR25, LMF1 (METTL26*) | Intergenic | C | G | 3.390 | 1.972 | 5.814 | 9.97×10−6 | 1.8×10−6 | 0.722 | 0.532 |
GRCh37, Genome Reference Consortium Human Build 37; A1, effective allele; A2, non-effective allele; OR, odds ratio of each SNP estimated by Firth’s logistic regression; L95, lower bound of the 95% confidence interval of the odds ratio; U95, upper bound of the 95% confidence interval of the odds ratio; p, p value of the odds ratio estimated by Firth’s logistic regression; p permutation, p value of the odds ratio estimated by permutation test; EAF, effective allele frequency in the sample.
Genes with significant eQTL are denoted by an asterisk.
rsID | GRCh37 | A1 | A2 | Nearest genes | Function | N | OR | LOG (OR)_SE | L95 | U95 | P | frq.A1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
rs663580 | 16:890470 | G | C | PRR25;LMF1 | intergenic | 166 | 0.295 | 0.276 | 0.172 | 0.507 | 9.97E-06 | 0.4217 |
rs6427989 | 1:203019737 | G | A | PPFIA4 | intronic | 166 | 0.248 | 0.316 | 0.134 | 0.461 | 1.02E-05 | 0.2922 |
rs9921151 | 16:889560 | T | C | PRR25;LMF1 | intergenic | 166 | 0.307 | 0.271 | 0.180 | 0.522 | 1.31E-05 | 0.3855 |
rs4984938 | 16:892018 | A | C | PRR25;LMF1 | intergenic | 166 | 0.326 | 0.264 | 0.194 | 0.547 | 2.18E-05 | 0.3705 |
rs4984694 | 16:891534 | T | C | PRR25;LMF1 | intergenic | 166 | 0.326 | 0.264 | 0.194 | 0.547 | 2.18E-05 | 0.3705 |
rs2994896 | 13:25688898 | G | T | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs3002220 | 13:25685942 | T | C | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994888 | 13:25686069 | C | T | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994891 | 13:25687231 | C | T | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994905 | 13:25691248 | C | T | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs3002221 | 13:25688642 | T | C | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994903 | 13:25690632 | A | G | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994901 | 13:25690337 | G | C | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994900 | 13:25690006 | C | A | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994893 | 13:25687738 | A | G | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994895 | 13:25688106 | C | A | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2487596 | 13:25689163 | A | C | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994899 | 13:25689121 | T | A | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs3002222 | 13:25690681 | T | C | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs9511642 | 13:25690775 | C | T | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs2994898 | 13:25688985 | C | T | PABPC3;AMER2 | intergenic | 166 | 0.329 | 0.262 | 0.197 | 0.550 | 2.19E-05 | 0.3946 |
rs10799925 | 1:163480019 | G | T | LOC100422212;PBX1 | intergenic | 166 | 0.314 | 0.273 | 0.184 | 0.536 | 2.22E-05 | 0.4578 |
rs731493 | 16:889509 | A | G | PRR25;LMF1 | intergenic | 166 | 0.325 | 0.265 | 0.193 | 0.547 | 2.31E-05 | 0.3886 |
rs12587912 | 14:37186791 | A | G | SLC25A21 | intronic | 166 | 3.070 | 0.268 | 1.817 | 5.189 | 2.79E-05 | 0.2771 |
rs6427740 | 1:163478748 | C | G | LOC100422212;PBX1 | intergenic | 166 | 0.327 | 0.267 | 0.194 | 0.552 | 2.84E-05 | 0.4639 |
rs4657294 | 1:163478287 | A | G | LOC100422212;PBX1 | intergenic | 166 | 0.327 | 0.267 | 0.194 | 0.552 | 2.84E-05 | 0.4639 |
rs6427737 | 1:163477733 | G | A | LOC100422212;PBX1 | intergenic | 166 | 0.327 | 0.267 | 0.194 | 0.552 | 2.84E-05 | 0.4639 |
rs7535284 | 1:163477356 | C | T | LOC100422212;PBX1 | intergenic | 166 | 0.327 | 0.267 | 0.194 | 0.552 | 2.84E-05 | 0.4639 |
rs1847474 | 11:38796040 | C | T | LINC01493;LRRC4C | intergenic | 166 | 3.179 | 0.277 | 1.846 | 5.473 | 3.02E-05 | 0.2801 |
rs77988724 | 2:101352758 | G | C | LINC01849;NPAS2 | intergenic | 166 | 0.222 | 0.361 | 0.109 | 0.451 | 3.14E-05 | 0.1958 |
rs9710701 | 2:101344222 | T | C | LINC01849;NPAS2 | intergenic | 166 | 0.222 | 0.361 | 0.109 | 0.451 | 3.14E-05 | 0.1958 |
rs6571759 | 14:37185166 | A | C | SLC25A21 | intronic | 166 | 3.072 | 0.271 | 1.807 | 5.221 | 3.38E-05 | 0.2741 |
rs2994892 | 13:25687237 | G | C | PABPC3;AMER2 | intergenic | 165 | 0.337 | 0.263 | 0.201 | 0.565 | 3.58E-05 | 0.3879 |
rs4984695 | 16:894583 | T | C | PRR25;LMF1 | intergenic | 165 | 0.335 | 0.265 | 0.199 | 0.563 | 3.69E-05 | 0.3697 |
rs1538950 | 10:31042464 | T | C | LINC02644;ZNF438 | intergenic | 166 | 2.927 | 0.262 | 1.752 | 4.887 | 4.06E-05 | 0.3494 |
rs1415914 | 6:141908348 | T | A | MIR4465;NMBR | intergenic | 166 | 2.843 | 0.255 | 1.726 | 4.686 | 4.12E-05 | 0.4428 |
rs1933284 | 6:141911476 | C | T | MIR4465;NMBR | intergenic | 166 | 2.843 | 0.255 | 1.726 | 4.686 | 4.12E-05 | 0.4428 |
rs1415907 | 6:141929466 | C | T | MIR4465;NMBR | intergenic | 166 | 2.835 | 0.255 | 1.721 | 4.669 | 4.27E-05 | 0.4428 |
rs9496098 | 6:141928791 | C | T | MIR4465;NMBR | intergenic | 166 | 2.835 | 0.255 | 1.721 | 4.669 | 4.27E-05 | 0.4428 |
rs9484536 | 6:141930580 | T | G | MIR4465;NMBR | intergenic | 166 | 2.835 | 0.255 | 1.721 | 4.669 | 4.27E-05 | 0.4428 |
rs61949678 | 13:25698133 | A | G | PABPC3;AMER2 | intergenic | 166 | 0.342 | 0.262 | 0.204 | 0.572 | 4.35E-05 | 0.3946 |
rs56090620 | 2:101354489 | G | A | LINC01849;NPAS2 | intergenic | 166 | 0.244 | 0.345 | 0.124 | 0.480 | 4.36E-05 | 0.2048 |
rs72814920 | 2:101348400 | A | G | LINC01849;NPAS2 | intergenic | 166 | 0.244 | 0.345 | 0.124 | 0.480 | 4.36E-05 | 0.2048 |
rs12477035 | 2:101352040 | G | A | LINC01849;NPAS2 | intergenic | 166 | 0.244 | 0.345 | 0.124 | 0.480 | 4.36E-05 | 0.2048 |
rs61181153 | 2:101352393 | A | G | LINC01849;NPAS2 | intergenic | 166 | 0.244 | 0.345 | 0.124 | 0.480 | 4.36E-05 | 0.2048 |
rs61106364 | 2:101352469 | T | C | LINC01849;NPAS2 | intergenic | 166 | 0.244 | 0.345 | 0.124 | 0.480 | 4.36E-05 | 0.2048 |
rs9630089 | 10:98968967 | G | A | ARHGAP19-SLIT1 | RNA_intron | 166 | 3.158 | 0.282 | 1.817 | 5.489 | 4.55E-05 | 0.3584 |
rs9484541 | 6:141989348 | G | T | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs7759161 | 6:141986865 | T | C | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs7755765 | 6:141990258 | C | A | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs12208890 | 6:141984984 | T | C | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs73777235 | 6:141989914 | C | T | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs1338103 | 6:141988466 | T | C | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs9484542 | 6:141991227 | G | A | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs7756112 | 6:141990441 | C | T | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs7740047 | 6:141987400 | G | A | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
rs7756402 | 6:141990439 | G | A | MIR4465;NMBR | intergenic | 166 | 2.764 | 0.250 | 1.694 | 4.508 | 4.66E-05 | 0.4548 |
Abbreviations: GRCh37, chromosome number and base pair position (GRCh37/hg19); A1, effective allele; A2, non-effective allele; OR, odds ratio; LOG(OR)_SE, standard error for logarithm of the odds ratio; L95, lower bound of confidence interval; U95, upper bound of confidence interval; P, P-value; frq.A1, effective allele frequency.
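The two tables report the same associations with opposite effect alleles for several variants, so odds ratios below 1 in Supplemental Table 1 correspond to reciprocal values in Table 2. The Python sketch below shows the conversion for rs663580, and also shows that the tabulated confidence limits are numerically consistent with a Wald interval, exp(log OR ± 1.96×SE). That the intervals were computed this way is an inference from the numbers, not something the article states; the function names are hypothetical.

```python
import math


def wald_ci(odds_ratio, se_log_or, z=1.96):
    """95% Wald confidence interval for an odds ratio, given SE of log(OR)."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


def flip_effect_allele(odds_ratio, lower, upper):
    """Re-express an OR and its interval for the opposite allele."""
    # Taking reciprocals swaps the roles of the lower and upper bounds.
    return 1.0 / odds_ratio, 1.0 / upper, 1.0 / lower


# rs663580 in Supplemental Table 1: OR 0.295, LOG(OR)_SE 0.276 for allele G.
lower, upper = wald_ci(0.295, 0.276)
flipped_or, flipped_lower, flipped_upper = flip_effect_allele(0.295, lower, upper)
```

The interval evaluates to approximately 0.172 to 0.507, matching Supplemental Table 1, and the flipped odds ratio of about 3.39 matches Table 2 for allele C; small differences in the third decimal of the flipped bounds arise from rounding of the tabulated inputs.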
Quantile–quantile plot of GWAS results. The negative logarithm of the observed (Y-axis) and expected (X-axis) P-values are plotted for each SNP. The red line indicates the null hypothesis of no true association (y=x), and the grey region indicates the 95% confidence interval of the red line. The genomic inflation factor (λGC) is shown on the upper left.
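For readers unfamiliar with the genomic inflation factor, a minimal Python sketch of its usual definition follows: the median of the observed 1-df chi-squared statistics divided by the median expected under the null (about 0.455). The function name is hypothetical, and converting two-sided p values back to chi-squared statistics through the normal quantile function is an assumption about the input format, not a description of the study’s pipeline.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor (lambda GC) from two-sided p values in (0, 1]."""
    nd = NormalDist()
    # A two-sided p value maps to z = Phi^{-1}(p / 2); chi-squared (1 df) = z^2.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-squared distribution, approximately 0.455.
    expected_median = nd.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median
```

A value near 1 indicates that the bulk of test statistics follows the null distribution; values below 1, such as the 0.892 reported here, indicate deflated statistics, which is consistent with the limited power noted in the text.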
Regional plots of −log10(P) are presented for the loci associated with the supernormal coronary arteries. The results of association analysis are shown in the 2-Mbp region surrounding the top variant (purple). Each dot indicates a variant, and different colors represent the linkage disequilibrium (r2) of each variant from the top variant.
The effective allele frequencies of lead variants are presented for each group and cohort. Red, dark blue, green, and light blue bars represent the effective allele frequency in the control group, supernormal group (CGC+CMERC-HI), supernormal group from the CGC cohort, and supernormal group from the CMERC-HI cohort, respectively. The standard deviation of the effective allele frequency in the two supernormal cohorts is presented as error bars.
Abbreviations: CGC, Cardiovascular Genome Center; CMERC-HI, Cardiovascular and Metabolic Disease Etiology Research Center - High Risk.
Among the 10 GWAS lead variants or their proxies (LD r2>0.8), three were associated with gene expression levels in the coronary artery, whole blood, or adipose tissues from the GTEx project (Fig.2 and Supplemental Table 2). rs9630089, an intronic lead variant in ARHGAP19-SLIT1, was associated with SLIT1 expression in the coronary artery, whole blood, and adipose tissues and with ARHGAP19 expression in adipose tissue. rs6427989, an intronic lead variant in PPFIA4, was associated with PPFIA4 expression in whole blood. rs4984694, an intergenic proxy of a lead variant (rs663580) between PRR25 and LMF1, was associated with METTL26 expression in whole blood. RGS4 was significantly expressed in the coronary artery tissue according to GTEx data (median transcripts per million =18.23), although rs10799925, an intergenic lead variant, and RGS4 expression showed only nominally significant eQTL association (normalized effect size =−0.150; p=0.032).
The figure shows violin plots of normalized gene expression levels (y axis) of GTEx project participants according to genotypes of genome-wide association study (GWAS) lead variants (x axis) at eQTL in the coronary artery, whole blood, or adipose tissues. The p value and normalized effect size (NES) of eQTL analysis from the GTEx data are presented above each violin plot. Numbers of participants according to genotype from GTEx data are presented below the genotype.
rsID | GRCh37 | A1 | A2 | Gencode | Gene | NES | P | Tissue |
---|---|---|---|---|---|---|---|---|
rs6427989 | 1:203019737 | G | A | ENSG00000143847.15 | PPFIA4 | 0.2 | 2.3E-06 | Whole Blood |
rs9630089 | 10:98968967 | G | A | ENSG00000187122.16 | SLIT1 | 0.46 | 3.3E-10 | Artery - Coronary |
rs9630089 | 10:98968967 | G | A | ENSG00000187122.16 | SLIT1 | 0.27 | 7.1E-09 | Adipose - Subcutaneous |
rs9630089 | 10:98968967 | G | A | ENSG00000187122.16 | SLIT1 | 0.23 | 6.8E-06 | Adipose - Visceral (Omentum) |
rs9630089 | 10:98968967 | G | A | ENSG00000213390.10 | ARHGAP19 | -0.14 | 7.9E-06 | Adipose - Subcutaneous |
rs4984694 | 16:891534 | T | C | ENSG00000130731.15 | METTL26 | 0.07 | 8.5E-05 | Whole Blood |
Abbreviations: GRCh37, chromosome number and base pair position (GRCh37/hg19); A1, effective allele; A2, non-effective allele; NES, normalized effect size of the eQTL; P, P-value of the eQTL.
In this study, we identified genetic variants suggestive of supernormal coronary arteries by GWAS conducted in individuals with high cardiovascular risk. The 10 identified lead variants were intergenic or intronic and near genes, including PBX1 and PPFIA4; however, their biological link to this protective phenotype has not been elucidated. eQTL analysis showed that three lead variants or their proxies were associated with expression levels of SLIT1, ARHGAP19, PPFIA4, and METTL26 (near LMF1) in the coronary artery, whole blood, or adipose tissues, supporting their potential biological impact. To the best of our knowledge, these results are the first to suggest genetic loci potentially associated with supernormal coronary arteries that were identified by analyzing individuals with a corresponding phenotype.
The genetic variants identified in this study are associated with supernormal or protected arteries in populations with high Framingham risk scores, representing a composite of diverse risk factors. Therefore, these variants might be related to a protective tendency against either specific risk factors or intermediate biological processes related to certain risk factors. Moreover, some genes near the lead variants have reportedly shown varied biological functions, including angiogenesis or regulation of the peripheral circadian clock22, 23). Further studies may clarify whether these variants or their biological products are significant targets for protection from vascular disease.
PBX1 encoding pre-B-cell leukemia transcription factor 1 was identified as the nearest gene to the first lead variant in the present study. Notably, major allele carriers of the locus showed the supernormal phenotype, and this gene is required for angiogenesis and related transcriptional activity in endothelial cells23). Furthermore, mouse Pbx1 plays a role in kidney vascular patterning by regulating Pdgfrb expression and associations between mural cells and blood vessels24). Additionally, Pbx1d-transgenic hyperlipidemic mice reportedly show elevated autoantibody production, higher counts of T helper cells, and impaired regulatory T cell function along with exacerbated atherosclerosis25). Moreover, loss of the Pbx gene induces misexpression of both vasoconstrictors and vasodilators, affects vascular smooth muscle cells, and causes persistent constriction26). Although studies have reported the effects of PBX1 on activities in multiple vascular cells, its role in the supernormal phenotype remains to be elucidated. RGS4 is located 433-kb away from rs10799925, a lead variant identified in the present study, and encodes a regulator of G protein signaling that plays a key role in cardiovascular functions, including maintenance of vascular tone and heart rate27). Interestingly, RGS4 is atheroprotective, based on its involvement in suppressing angiotensin II-induced inflammatory and atherogenic gene expression through peroxisome proliferator-activated receptor delta activation28). Identification of a lead variant near RGS4 may be in line with the previous findings.
NPAS2, encoding neuronal PAS domain protein 2, was the nearest gene to one of the lead variants identified in this study, and major allele carriers of this locus showed supernormal arteries. NPAS2 is a transcription factor involved in circadian rhythm maintenance in mammals22), with disrupted circadian variation affecting pressor response to external stress and influencing time-dependent cardiovascular events29). Furthermore, Npas2 deletion alters thrombogenicity as well as blood pressure in mice30). Although the relationships between circadian rhythm and NPAS2 remain incompletely understood, this pathway could show a potential association with supernormal arteries.
The NMBR gene is located in close proximity to a lead variant identified in this analysis and is reportedly associated with retinal venular microcirculation31) and plaque morphology observed by intravascular ultrasound32), according to two respective GWAS findings. SLC25A21, which encodes a mitochondrial 2-oxodicarboxylate carrier, was identified near our lead variant, and it was found to be involved in signaling related to smoking-cessation behavior33). Additionally, LMF1, also located near one of the identified lead variants, is associated with severe hypertriglyceridemia and affects lipoprotein lipase function34). However, it remains unclear whether and how either of these two genes is related to vascular protection.
The AMER2 gene, encoding APC membrane-recruitment protein 2, is located near one of our lead variants. It is reportedly involved in controlling cell migration35). Furthermore, several genes identified in the present study, including PPFIA4 36), have been reported in the context of cancer cell biology; however, data concerning plausible mechanisms related to the supernormal phenotype and other genes identified in this study are limited. Nearly all genes close to the lead variants were unrelated to traditional risk factors. Although plausible biological links of the genes to athero-protection have been mentioned above (Fig.3), further studies are needed to clarify their functions.
Plausible biological links of several genes near the lead variants to athero-protection
This study has limitations. First, the current case-control study divided individuals based on disease-resistant versus disease-nonresistant phenotypes rather than a classical grouping of disease-prone versus disease-free individuals. Although the definition of a disease-prone phenotype is clear, definitions of a disease-nonresistant phenotype may vary. In the present study, the control group included individuals with high Framingham risk scores and coronary artery disease, a disease-prone phenotype. Therefore, it is possible that use of a different definition or inclusion of disease-nonresistant individuals could lead to different results. Second, our sample size was relatively small, which limited the statistical power necessary to detect variants at genome-wide significant levels. However, recruitment of individuals with supernormal coronary arteries is difficult due to the limited number of such populations, as well as the limited availability of imaging data. Notably, permutation tests using the same sample size supported the identified associations. Further studies using independent cohorts with larger sample sizes may be helpful in validating the present findings. Nevertheless, the study approach and findings presented here concerning this medically meaningful phenotype are noteworthy. Third, most of the individuals in the supernormal group were enrolled because they had a coronary calcium score of 0. Although this condition indicates a very low risk of coronary artery disease, it may not be exactly the same as a normal coronary artery. Finally, the supernormal group had higher frequencies of some risk factors. Although it may be interesting that supernormal arteries were maintained in this group in spite of these conditions, we cannot predict how these differences would affect our analysis results. It is difficult to completely rule out possible selection bias, as we enrolled the study population from two cohorts. However, we sought to address this possibility, as described earlier (Supplemental Figs.2 and 5).
In conclusion, we identified genetic variants suggestive of supernormal coronary arteries and their correlated genes in human tissues. These results highlighted genetic loci potentially associated with arteries protected from atherosclerotic coronary disease. Their biological link to the phenotype needs to be examined further.
None.
The authors declare no conflicts of interest.
This research was supported by a National Research Foundation of Korea grant funded by the Korean government (The Ministry of Science and ICT) (2019R1A2C4070496 and 2019R1F1A1057952). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
B.K. analyzed and interpreted the data and wrote the manuscript. C.J.L. analyzed and interpreted the data and wrote the manuscript. H.H.W. proposed the study design; acquired the funding; interpreted the data; supervised the study; revised the manuscript. S.H.L. proposed the study design; acquired the funding; interpreted the data; supervised the study; revised the manuscript. The manuscript has been read and approved by all the authors, the requirements for authorship have been met, and the manuscript has been approved for publication by all the authors.