Circulation Journal
Online ISSN : 1347-4820
Print ISSN : 1346-9843
ISSN-L : 1346-9843

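A final published version of this article is available. Please refer to the final published version.
When citing, please also cite the final published version.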

Mutation Analysis of the Main Hypertrophic Cardiomyopathy Genes Using Multiplex Amplification and Semiconductor Next-Generation Sequencing
Juan Gómez, Julian R. Reguero, César Morís, María Martín, Victoria Alvarez, Belén Alonso, Sara Iglesias, Eliecer Coto
Journal: free access, HTML, advance online publication

Article ID: CJ-14-0628

Abstract

Background: Mutations in at least 30 genes have been linked to hypertrophic cardiomyopathy (HCM). Due to the large size of the main HCM genes, Sanger sequencing is labor intensive and expensive. The purpose of this study was to develop a next-generation sequencing (NGS) procedure for the main HCM genes.

Methods and Results: Multiplex amplification of the coding exons of MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1 was designed, followed by NGS with the Ion Torrent PGM (Life Technologies). A total of 8 pools containing DNA from HCM patients were sequenced in a 2-step approach. First, a total of 60 patients (validation cohort) underwent both PGM and Sanger sequencing for the 9 genes. No false-negative variants were found on NGS (100% sensitivity), and a specificity of 97% and 80% was achieved for single-nucleotide and insertion/deletion variants, respectively. Second, the PGM was used to search for mutations in a total of 76 cases not previously studied (discovery cohort). A total of 19 putative mutations were identified in the discovery pools, which were confirmed and assigned to specific patients on Sanger sequencing.

Conclusions: An NGS procedure has been developed for the main sarcomeric genes that would facilitate the screening of large cohorts of patients. In addition, this procedure would facilitate the uncovering of rare gene variants on a population scale.

Mutations in at least 30 different genes have been found in patients with hypertrophic cardiomyopathy (HCM), with MYBPC3 and MYH7 accounting for approximately 50% of the mutations.1–4 Due to the large size of these genes, Sanger sequencing of single amplicons is labor intensive and expensive. Next-generation sequencing (NGS) technologies could facilitate the genetic screening of the HCM genes in large cohorts of patients.5,6 Most of the reported NGS procedures are based on polymerase chain reaction (PCR) amplification of the coding exons from each patient with primers that match the flanking introns, followed by pooling and digestion of the PCR products to achieve a readable size (commonly, <200 bp) and ligation of a specific oligonucleotide (barcode) to each fragment.7–9 Because each patient can be recognized via the barcode, many different patients can be sequenced in a single array. In practice, this means that NGS of a large number of patients requires many different PCR and barcoding assays. One way to reduce the experimental time and cost would be to perform multiplex amplification (all the target sequences in a few tubes) of DNA pools. The putative mutations found in a pool could then be assigned to a specific individual by Sanger sequencing of the corresponding exon in all the individuals used to create the pool (Figure 1). Despite the labor and cost savings of this approach, a main limitation of NGS of DNA pools is that rare nucleotide variants could be diluted by the wild-type allele to a level too low to be detected (false negatives).10 Other authors, however, considered this a valid approach to search for mutations in mendelian disorders.11

Figure 1.

Flow-chart for the next-generation sequencing of DNA pools to characterize mutations in the main hypertrophic cardiomyopathy (HCM) genes.


The purpose of this study was to develop and validate a procedure for sequencing the most commonly mutated genes in HCM, based on 2-tube multiplex amplification of DNA pools and NGS with the Ion Torrent semiconductor (non-optical) Personal Genome Machine (PGM; Life Technologies). This approach would facilitate the rapid and cost-effective search for rare DNA variants in large numbers of individuals.

Methods

Patient Characteristics

This research, including the informed consent forms and procedures, was approved by the Ethics Committee of Hospital Universitario Central Asturias (HUCA). All the patients were Caucasian and from the region of Asturias (Northern Spain) and gave their written informed consent to participate in the study, which was recorded in the patient’s clinical history.

HUCA is a reference center for the genetic study of HCM in Spain. The study involved a total of 136 unrelated HCM index cases, recruited through the Cardiology Department of HUCA in the period 2001–2013. HCM was diagnosed based on clinical symptoms and left ventricular septum (LVS) >15 mm in the absence of any other condition that could explain the hypertrophy (such as hypertension). Patients with at least 1 relative who had also been diagnosed with HCM were defined as familial cases.

Patients were divided into 3 groups (Table 1): (1) validation Sanger to NGS (n=26), previously Sanger sequenced for the coding exons (plus at least 5 intronic flanking nucleotides) of MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1;10–12 (2) validation NGS to Sanger (n=34), first sequenced on NGS and then on Sanger for the 9 genes; or (3) discovery (n=76), patients not previously sequenced or only partly sequenced, who underwent NGS followed by Sanger sequencing of the exons containing putative mutations.

Table 1. HCM Patient Characteristics (n=136)
Characteristics Mean±SD or n (%)
Mean age at diagnosis (years) 48±13
 Range 19–76
Male 81 (60)
HCM history 58 (43)
Mean BMI
 Male 26±3
 Female 24±4
Mean IVS 21±5
Mean PWT 12±5
Mean LVWT 32±6
Dyspnea 90 (66)
NYHA index
 Class I–II 63 (46)
 Class III–IV 27 (20)
Angina 47 (35)
Syncope 37 (27)
Atrial fibrillation 31 (23)
Arrhythmia (Holter monitoring) 45 (33)
LVOTO >30 mmHg 49 (36)

BMI, body mass index; HCM, hypertrophic cardiomyopathy; IVS, interventricular septum; LVOTO, left ventricular outflow tract obstruction; LVWT, left ventricular wall thickness; NYHA, New York Heart Association; PWT, posterior wall thickness.

DNA Template Preparation

DNA was obtained by a salting-out method, resuspended in water, and adjusted to a final concentration of 10 ng/µl using real-time Taqman quantification with RNase P Detection Reagents (FAM™; Life Technologies) in a 7500 Real Time PCR-System (Applied Biosystems). Using this procedure, we also confirmed that all the DNA samples were suitable for amplification.

Eight DNA pools containing 10 µl of each corresponding DNA sample were produced (Table S1). Pool 1 consisted of 13 patients from the Sanger to NGS group, each of whom had 1 unique nucleotide variant, either a mutation or a polymorphism (control variants), that would thus be present at an allele frequency of 1/26 in the pool. Three pools (2–4), with 12–16 samples per pool, consisted of 4 patients from the Sanger to NGS group who harbored a mutation (control variants) plus 34 patients from the NGS to Sanger group. Finally, 4 pools (A–D), with 20–25 samples per pool, consisted of 12 patients from the Sanger to NGS group plus 76 patients from the discovery group.
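
The dilution of a control variant follows directly from the pool size. The following is a minimal sketch, assuming that every patient contributes an equal amount of diploid DNA and that the variant is carried in the heterozygous state by a single patient; the function name is illustrative and not part of the study's software.

```python
from fractions import Fraction


def singleton_allele_fraction(n_individuals: int, ploidy: int = 2) -> Fraction:
    """Expected allele fraction of a variant carried on one chromosome
    of one individual in an equimolar pool (assumption: equal DNA input)."""
    if n_individuals < 1:
        raise ValueError("a pool needs at least one individual")
    return Fraction(1, ploidy * n_individuals)


# Pool sizes mentioned in the text: 13 (pool 1), up to 16 (pools 2-4), up to 25 (pools A-D).
for n in (13, 16, 25):
    frac = singleton_allele_fraction(n)
    print(f"{n} individuals: 1/{frac.denominator} = {float(frac):.2%}")
```

For pool 1 this gives 1/26 (about 3.85%), and for a pool of 25 individuals 1/50 (2%), which is why the variant-calling threshold has to sit well below these values.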

Multiplex (Ampliseq) Amplification

A 2-tube multiplex amplification for the coding sequence exons plus at least 5 intronic flanking nucleotides (approximately 16 kb) of the MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1 genes was designed online (Ion AmpliSeq™ Designer; https://www.ampliseq.com). We compared several primer design options and ordered the one that gave the maximum target sequence coverage. Primer pairs to amplify a total of 176 fragments were provided by the manufacturer in only 2 tubes. The amplicons covered 99% of the target sequence (Tables S2,S3).

Each DNA pool was amplified with the Ion Ampliseq™ Library Kit in conjunction with Ion Ampliseq™ Custom Primer Pool protocols according to the manufacturer's procedures (Life Technologies), with the following steps: PCR in 2 tubes, partial digestion of the primers with FuPa Reagent, ligation of the barcode adapters (only for the discovery cohort pools), purification with Agencourt® AMPure® XP Reagent, PCR with the adapters using Platinum® PCR SuperMix High Fidelity enzyme (Invitrogen), purification with Agencourt® AMPure® XP Reagent, quantification of the sample (Agilent Bioanalyzer Instrument and Qubit® 2.0 Fluorometer), and dilution of the sample to a final concentration of 20 pmol/L.

Template preparation, emulsion PCR, emulsion breaking, and enrichment were performed using the Ion PGM™ Template OT2 200 kit following the manufacturer's instructions (Life Technologies). Briefly, a total of 10 ng of the DNA pool was amplified in 2 Ampliseq tubes using the Ion Ampliseq™ Library Kit. The reactions were quantified (Agilent Bioanalyzer) and then emulsion PCR was done using the Ion PGM Template OT2 200 Kit and the Ion One-Touch instrument (Life Technologies). Template-positive spheres were recovered using Dynabeads MyOne Streptavidin C1 beads and quantified using the Ion Sphere™ Quality Control Assay and the Qubit 2.0 fluorometer (Life Technologies).

We performed 3 massively parallel sequencing experiments: pool 1 was sequenced with an Ion Torrent 316 (100-Mb) array, and each of the NGS to Sanger (2–4) and discovery (A–D) pools was individually barcoded and sequenced in two 318 (1,000-Mb) arrays.

NGS was performed using the PGM 200 sequencing kit protocol in the Ion Torrent PGM. We used 260-flow runs, which support a template read length of approximately 200 bp. The number of samples used to create the pools was decided taking into account the load capacity of the array, the total length of the target sequences (approximately 16 kb), the dilution of a unique rare allele inside the pool, and the number of reads per amplicon necessary to achieve a theoretical minimum 50× coverage.
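
The trade-off between array capacity, number of amplicons, and pool size can be sketched as a back-of-the-envelope calculation. This is only an illustration under strong assumptions that are not stated in the text: all reads are on target, reads are spread evenly over the amplicons, and every chromosome in the pool is equally represented. The function names are hypothetical.

```python
def per_allele_coverage(chip_bases: float, read_length_bp: float,
                        n_amplicons: int, n_individuals: int,
                        ploidy: int = 2) -> float:
    """Rough expected read depth per chromosome copy in a pool
    (assumes uniform amplification and 100% on-target reads)."""
    total_reads = chip_bases / read_length_bp
    reads_per_amplicon = total_reads / n_amplicons
    return reads_per_amplicon / (ploidy * n_individuals)


def max_pool_size(chip_bases: float, read_length_bp: float,
                  n_amplicons: int, min_cov_per_allele: float = 50,
                  ploidy: int = 2) -> int:
    """Largest pool that still reaches the per-allele coverage target."""
    reads_per_amplicon = chip_bases / read_length_bp / n_amplicons
    return int(reads_per_amplicon // (ploidy * min_cov_per_allele))


# Nominal values taken from the text: 100-Mb array, ~200-bp reads, 176 amplicons.
print(per_allele_coverage(100_000_000, 200, 176, n_individuals=13))
print(max_pool_size(100_000_000, 200, 176, min_cov_per_allele=50))
```

With these nominal inputs the estimate is roughly 109× per allele for a 13-patient pool, and a ceiling of 28 individuals for a 50× per-allele target. Real runs deviate from this because amplicon yield is uneven and the actual throughput of an array differs from its nominal capacity.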

NGS Data Analysis

The raw PGM data were processed with Torrent Suite v3.4.2 (Life Technologies) to generate sequence reads filtered by the pipeline software quality controls. Read assembly and variant identification were done with Variant Caller (VC) v3.4.51874, using FastQ files containing the sequence reads and the BED file from the Ion Ampliseq Designer software to map the amplicons. The Integrative Genomics Viewer (IGV, Broad Institute) was used for the analysis of depth of coverage, sequence quality, and variant identification. Variants were identified with the somatic sample VC default algorithm. We considered 3 coverage categories: at <20× coverage per allele, amplicons were discarded and the corresponding exons were Sanger sequenced in each individual; at 20–50× coverage per allele, amplicons were considered admissible and the BAM files were visualized to confirm the read quality and the nucleotide variants; and at >50× coverage per allele, amplicons were considered optimal.
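
The three coverage categories can be expressed as a small decision rule. This is a sketch of the thresholds as described above, not the authors' actual script; the function name and the returned labels are illustrative.

```python
def classify_amplicon(per_allele_coverage: float) -> str:
    """Map per-allele coverage to the handling described in the text."""
    if per_allele_coverage < 20:
        # Too shallow: discard NGS data, Sanger sequence the exon in each individual.
        return "discard"
    if per_allele_coverage <= 50:
        # Usable, but BAM files are inspected visually to confirm quality and variants.
        return "admissible"
    return "optimal"


for cov in (5, 20, 50, 120):
    print(cov, classify_amplicon(cov))
```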

Because insertions/deletions (INDELS) are frequently missed by the PGM (and other NGS platforms), we performed a specific analysis to reduce the risk of non-detection of true INDELS (false negatives). The somatic sample VC default algorithm was set with low sample coverage, minimum allele frequency, and minimum variant frequency values of 10,000, 0.01, and 0.01, respectively. In addition, the BAM files of amplicons containing putative INDELS were visualized, and those that mapped in only 1 strand were discarded.
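
The two INDEL criteria that can be applied mechanically (a minimum variant frequency of 0.01 and support on both strands) can be sketched as a filter. The record layout, the function name, and the read counts below are hypothetical; in the study the strand check was done by visual inspection of the BAM files.

```python
from dataclasses import dataclass


@dataclass
class IndelCall:
    name: str                 # illustrative label, not a real variant
    variant_frequency: float  # fraction of reads supporting the INDEL
    forward_reads: int        # supporting reads on the forward strand
    reverse_reads: int        # supporting reads on the reverse strand


def passes_indel_filter(call: IndelCall, min_frequency: float = 0.01) -> bool:
    """Keep a call only if it reaches the frequency threshold and is
    supported by reads on both strands."""
    on_both_strands = call.forward_reads > 0 and call.reverse_reads > 0
    return call.variant_frequency >= min_frequency and on_both_strands


calls = [
    IndelCall("indel_A", 0.034, forward_reads=180, reverse_reads=160),
    IndelCall("indel_B", 0.045, forward_reads=400, reverse_reads=0),  # single strand
    IndelCall("indel_C", 0.004, forward_reads=20, reverse_reads=25),  # below 1%
]
print([c.name for c in calls if passes_indel_filter(c)])
```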

Sanger Sequencing and Putative Mutation Assignment

Nucleotide variants that fulfilled any of the following criteria were considered as putative mutations: had a functional effect (missense, nonsense or frameshifting amino acid changes; pre-mRNA splicing); reported in the Human Gene Mutation Database (HGMD); or classified as likely pathogenic on bioinformatics in silico prediction (Polyphen and SIFT). For each putative mutation in the 7 discovery pools the corresponding DNA samples were individually amplified and sequenced with BigDye chemistry using ABI3130 equipment (Life Technologies) to identify the mutation carrier. Briefly, the exon containing the nucleotide variant was amplified with primers that matched the flanking introns, and the PCR fragments were purified and sequenced.13,14
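
Read literally, the selection rule is a logical OR over three criteria. A minimal sketch follows, assuming each variant has already been annotated; the effect labels and the function name are illustrative choices, not identifiers from the study.

```python
# Effect classes listed in the text as having a functional effect.
FUNCTIONAL_EFFECTS = {"missense", "nonsense", "frameshift", "splicing"}


def is_putative_mutation(effect: str, in_hgmd: bool,
                         predicted_pathogenic: bool) -> bool:
    """A variant qualifies if it meets any one of the three criteria."""
    return effect in FUNCTIONAL_EFFECTS or in_hgmd or predicted_pathogenic


print(is_putative_mutation("missense", in_hgmd=False, predicted_pathogenic=False))
print(is_putative_mutation("synonymous", in_hgmd=False, predicted_pathogenic=False))
```

Under this reading a missense change is carried forward to Sanger assignment even when both predictors call it tolerated or benign, which is consistent with such variants appearing in Table 4.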

Results

Validation of Ampliseq HCM Genes

A DNA pool-based strategy was carried out, using custom target multiplex amplification for the main HCM genes (MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1) in only 2 tubes, followed by massively parallel sequencing in the Ion Torrent PGM sequencer. As a first validation step, a pool of DNA from 13 patients (pool 1) was amplified and PGM sequenced in the medium-capacity (100-Mb) 316 semiconductor chip. By processing a single pool we avoided the cost and labor time of individual amplification, barcoding, and library preparation of the 13 DNA samples.

The array load density was 73% with 344 Mb of nucleotide reads, a mean read length of 138 bp, and 98% of the amplicons having >100× coverage (Table S4). A total of 172 of the 176 amplicons (98%) had optimal reads, and only 4 (2%) had null or poor reads (Figure 2). Among the non-readable amplicons, 2 corresponded to exon 1 of TPM1, 1 to exon 12 of MYBPC3, and 1 to exon 6 of ACTC1. Because these failures were replicated in the 318 arrays and the sequences of the primer pairs were correct, we concluded that the absence of nucleotide reads for the 4 amplicons was likely due to some characteristic that made them refractory to amplification, such as a high GC content (3 of the 4 amplicons had a GC content >60%; Table S5).
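
GC content, the property suggested above as a likely cause of the failed amplicons, is straightforward to compute from an amplicon sequence. The sketch below uses made-up sequences purely for illustration (the real amplicon sequences are in the supplementary tables), and the 0.60 cut-off simply mirrors the >60% figure quoted in the text.

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_gc_rich(sequence: str, threshold: float = 0.60) -> bool:
    """Flag sequences whose GC fraction exceeds the threshold."""
    return gc_fraction(sequence) > threshold


# Hypothetical sequences, not taken from the study.
for name, seq in {"amplicon_x": "GCCGCGGCAT", "amplicon_y": "ATTAGCATTA"}.items():
    print(name, f"{gc_fraction(seq):.0%}", is_gc_rich(seq))
```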

Figure 2.

Nucleotide reads for the amplicons of the 9 genes.

The VC identified a total of 45 single-nucleotide variants (SNV) in validation pool 1 (Table S6). All the control variants in readable amplicons were detected at a threshold frequency >1% (allele frequency range, 3.35–5.42%; Table 2). The VC also identified all the variants in readable amplicons previously found through Sanger sequencing of the 13 patients. We identified 2 MYH7 variants that had not been found on Sanger sequencing: c.T136>C and c.T5345>A. These nucleotide changes were thus false positives. On visual inspection of the BAM files using IGV, c.T136>C overlapped 2 amplicons, and most of the c.5345A reads were forward strand sequences (Figure S1).

Table 2. Rare Variants in Sanger to NGS Validation Pool (n=13)
Gene Nucleotide position Exon/Intron cDNA Effect Rare variant (%)
MYH7 23898994 Exon 12 c.1128C>T p.D376D 4.02
MYH7 23900093 Intron 10 c.895+17G>A None 4.43
MYBPC3 47358997 Exon 24 c.2547C>T p.V849V 4.01
MYBPC3 47360053 Intron 22 c.2308+18C>G None 3.54
MYBPC3 47365049 Exon 13 c.1217G>A p.S406N 3.76
MYBPC3 47370076 Exon 6 c.671_673delTGC p.L224fs 3.35
MYBPC3 47371598 Exon 4 c.472G>A p.V158M 3.85
TNNT2 201330429 Exon 14 c.758A>G p.K260R 4.49
TNNI3 55667958 Intron 3 c.150+13G>A None 4.88
TNNI3 55668992 Exon 1 c.–35C>A None 5.42
MYL2 111353556 Exon 3 c.132T>C p.I44I 3.78
MYL3 46902491 Intron 1 c.130–14G>T None 3.89
TPM1 63353098 Exon 5 c.523G>A p.D175N 5.13

NGS, next-generation sequencing.

The VC identified the only INDEL control variant (MYBPC3 c.671_673delTGC; Table 2). In addition, the VC identified 10 INDELS, but only c.53-11_53-7delCTTCTT in TNNT2 was found on Sanger sequencing (Table S7). On inspection using IGV, the other 9 were false positives that mapped in only 1 strand or in homopolymer regions longer than 5 nucleotides.

Sensitivity and Specificity

To extend the validation, pools 2–4 containing a total of 37 HCM patients underwent NGS followed by Sanger sequencing of the 9 genes. The main data of the Ion PGM runs are summarized in Table S4. In the 3 pools, we replicated the 4 sequencing failures (<20× coverage) previously observed in validation pool 1. The VC identified a total of 52, 56, and 49 SNV in the 3 pools, respectively (Tables S8–S10). We excluded the occurrence of false-negative SNV by Sanger sequencing of the 9 genes in all patients used to create the pools. Moreover, we identified a total of 49 variants that were either mutations or polymorphisms seen in only 1 sample inside the corresponding pool (Table 3). Together, these data confirmed the accuracy of the method in avoiding false negatives. The VC also identified the 2 false-positive SNV previously found in validation pool 1. With regard to the INDELS, no false negatives and 3 false positives were found (Tables S11–S13).

Table 3. Rare Variants in NGS to Sanger Validation Pools (n=37)
Gene Position Exon/Intron cDNA Effect Frequency (%) DNA pool
MYH7 23886055 Intron 33 c.4644+22G>A None 4.5 2
MYH7 23886064 Intron 33 c.4644+12_4644+13delTG None 4.8 1
MYH7 23886064 Intron 33 c.4644+12_4644+13delTG None 4.2 2
MYH7 23886155 Exon 33 c.4544T>C p.T1522T 2.6 2
MYH7 23886264 Intron 32 c.4520–63G>A None 3.4 1
MYH7 23886504 Exon 32 c.4377G>T p.K1459N 2.8 1
MYH7 23888371 Intron 29 c.3972+15C>T None 3.2 3
MYH7 23892799 Exon 24 c.3062C>A p.T1019N 4.0 2
MYH7 23892950 Intron 23 c.2923–18G>A None 4.1 3
MYH7 23893034 Intron 23 c.2922+82C>T None 4.2 3
MYH7 23893995 Exon 22 c.2662C>A p.Q888K 3.3 3
MYH7 23897077 Exon 16 c.1605A>G p.E535E 4.1 3
MYH7 23898994 Exon 12 c.1128C>T p.D376D 3.9 1
MYH7 23898994 Exon 12 c.1128C>T p.D376D 4.3 2
MYH7 23898994 Exon 12 c.1128C>T p.D376D 4.7 3
MYH7 23899027 Exon 12 c.1095G>A p.K365K 2.9 1
MYH7 23899038 Exon 12 c.1084G>A p.M362V 2.8 1
MYH7 23899793 Exon 11 c.975C>T p.D325D 4.5 2
MYH7 23900093 Intron 10 c.895+17G>A None 4.5 1
MYH7 23901012 Exon 7 c.597A>G p.A199A 4.6 1
MYH7 23901012 Exon 7 c.597A>G p.A199A 3.6 2
MYH7 23901922 Exon 5 c.428G>A p.R143Q 4.0 2
MYBPC3 47356615 Exon 26 c.2883G>A p.P961P 4.5 3
MYBPC3 47357416 Intron 25 c.2737+12C>T None 5.7 2
MYBPC3 47358997 Exon 24 c.2547C>T p.V849V 3.3 2
MYBPC3 47359014 Exon 24 c.2531_2532 insGA p.M844fs 5.5 1
MYBPC3 47359014 Exon 24 c.2531_2532 insGA p.M844fs 5.3 3
MYBPC3 47360053 Intron 22 c.2308+18C>G None 3.4 1
MYBPC3 47360053 Intron 22 c.2308+18C>G None 4.9 2
MYBPC3 47360133 Exon 22 c.2246G>A p.Y749C 5.5 1
MYBPC3 47364138 Exon 18 c.1615A>G p.I539V 3 1
MYBPC3 47362642 Intron 18 c.1847+47G>A None 2.8 2
MYBPC3 47362642 Intron 18 c.1847+47G>A None 5.2 3
MYBPC3 47364129 Exon 16 c.1624G>C p.E542Q 4.4 3
MYBPC3 47364248 Exon 16 c.1505G>A p.R502Q 2.6 3
MYBPC3 47364975 Intron 13 c.1223+68C>T None 2.8 3
MYBPC3 47367823 Exon 12 c.1025T>A p.V342D 2.6 1
MYBPC3 47369443 Exon 7 c.786C>T p.T262T 3.6 3
MYBPC3 47370037 Exon 6 c.710A>G p.Y237C 3.6 1
MYBPC3 47370041 Exon 6 c.706A>G p.S236G 3.3 3
MYBPC3 47370074 Exon 6 c.671_673delTGC p.L224fs 4.4 2
MYBPC3 47370074 Exon 6 c.671_673delTGC p.L224fs 4.6 3
MYBPC3 47371598 Exon 4 c.472G>A p.V158M 5.7 1
MYBPC3 47371598 Exon 4 c.472G>A p.V158M 4.2 3
TNNT2 201328272 Exon 16 c.*66G>A None 3 2
TNNT2 201330429 Exon 14 c.758A>G p.K260R 4.5 2
TNNI3 55665410 Exon 6 c.537G>A p.E179E 4.8 3
TNNI3 55668397 Intron 2 c.180+21G>A None 4 3
MYL2 111351974 Intron 4 c.274+16_274+17insCT None 2.2 3
MYL2 111353556 Exon 3 c.132T>C p.I46I 3.1 2
MYL3 46901019 Exon 4 c.427G>A p.E143K 3.7 1
MYL3 46902491 Intron 1 c.130–14G>T None 3.7 1
TPM1 63335074 Exon 1 c.46G>C p.E16Q 3.4 3
TPM1 63353451 Exon 6 c.689+313A>G p.A216A 2.7 1
TPM1 63356237 Intron 8 c.898+1393C>T None 5.5 1
TPM1 63358033 Intron 9 c.898+3189delT None 5.2 1
TPM1 63356331 Exon 9 c.841A>G p.M281V 3 2
ACTC1 35083251 Intron 6 c.990+64C>T None 3.8 2

Known control variants. NGS, next-generation sequencing.

A total of 60 patients (pools 1–4) underwent both NGS and Sanger sequencing, with 100% sensitivity of the NGS (no false-negative variants). With regard to the specificity, there were only 2 false-positive SNV (MYH7 c.T136>C and c.T5345>A). As expected, the number of false positives was higher for the INDELS: a total of 20 fulfilled the criteria for being deconvoluted and only 4 were false positives (80% specificity).
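
The INDEL figures above can be reproduced with simple counts. The sketch below assumes that the 80% value is derived as the share of the 20 called INDELS that were confirmed on Sanger sequencing (16 of 20), which is this example's reading of the text; computed that way, the quantity is what is often called the positive predictive value. The function names are illustrative.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of real variants that NGS detected."""
    return true_positives / (true_positives + false_negatives)


def confirmed_fraction(total_calls: int, false_positives: int) -> float:
    """Share of NGS calls that were confirmed on Sanger sequencing."""
    return (total_calls - false_positives) / total_calls


# INDEL counts from the text: 20 calls deconvoluted, 4 false positives, no false negatives.
indel_calls, indel_fp, indel_fn = 20, 4, 0
print(f"sensitivity: {sensitivity(indel_calls - indel_fp, indel_fn):.0%}")
print(f"confirmed calls: {confirmed_fraction(indel_calls, indel_fp):.0%}")
```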

PGM of the Discovery Pools

After determining the accuracy to detect rare variants in the validation pools, we sequenced a total of 86 patients in 4 discovery pools (A–D; 20–25 samples per pool). In addition to patients not or partly Sanger sequenced (and negative for mutations; n=76), each pool also contained the DNA from 2–4 patients with known mutations. The main data of the Ion PGM runs are summarized in Table S4. We replicated the 4 NGS failures (<20× coverage) previously observed in the validation pools. Thus, exons 1 of TPM1, 12 of MYBPC3, and 6 of ACTC1 should be Sanger sequenced in all patients as part of mutation screening.

At a 1% threshold the VC identified all the known control mutations in the 4 pools (Table S14; an Excel file with all the variants is available upon request from the corresponding author). We also confirmed and assigned to specific patients, through Sanger sequencing of the corresponding exon, a total of 19 rare nucleotide changes that were classified as probably damaging or variants of uncertain effect (Figure 3; Table 4; Table S15). We found at least 1 putative mutation in 17 of the 76 patients. Two patients were carriers of 2 different variants (MYBPC3 p.A216T+p.E258K, and MYH7 p.L620P+p.K1459N). We also excluded the presence of additional patients with any of the control variants by sequencing the corresponding exon from all the patients in each pool. The VC also reported the 2 false MYH7 SNV in the 4 discovery pools.

Figure 3.

Integrative Genomics Viewer and Sanger electropherogram of variants found in the discovery pools, including a G insertion (p.V931fs) in MYBPC3, and nucleotide changes (p.E143Q and p.Y749C) in MYL3 and MYBPC3.

Table 4. Rare Variants, Putative Mutations, in the Discovery Pools
Gene Nucleotide position Exon/Intron cDNA Effect ESP frequency HGMD HCM history SIFT Polyphen
MYH7 23886504 Exon 32 c.4377G>T p.K1459N 1/4300 Yes No Damaging Probably damaging
MYH7 23895023 Exon 20 c.2167C>G p.R723C No Yes No Damaging Probably damaging
MYH7 23896042 Exon 18 c.1988G>A p.R663H 1/4300 Yes Yes Damaging Possibly damaging
MYH7 23896823 Exon 16 c.1859T>C p.L620P No No No Damaging Probably damaging
MYH7 23902931 Exon 3 c.11C>T p.S4L No Yes No Tolerated Benign
MYBPC3 47355475 Exon 27 c.2992C>G p.Q998E No Yes Yes Damaging Probably damaging
MYBPC3 47359046 Exon 24 c.2498C>T p.A833V 2/4265 Yes No Tolerated Probably damaging
MYBPC3 47360133 Exon 22 c.2246G>A p.Y749C No No No Damaging Probably damaging
MYBPC3 47367816 Exon 12 c.1032C>A p.D344E No No No Damaging Possibly damaging
MYBPC3 47369442 Exon 7 c.787G>T p.G263X No No Yes
MYBPC3 47369975 Exon 6 c.772G>A p.E258K No Yes Yes Damaging Possibly damaging
MYBPC3 47371333 Exon 5 c.646G>A p.A216T 1/4204 Yes Yes Tolerated Benign
MYBPC3 47371414 Exon 5 c.565G>A p.V189I 29/4208 No No Tolerated Benign
MYBPC3 47371619 Exon 4 c.451G>A p.D151N No No No Tolerated Benign
MYBPC3 47373032 Exon 2 c.50G>A p.R17Q 1/4204 No No Tolerated Possibly damaging
TNNT2 201328348 Exon 16 c.848G>A p.R283H 1/4299 Yes Yes Damaging Probably damaging
TNNT2 201328373 Exon 16 c.823C>T p.R275C 5/4299 Yes No Damaging Probably damaging
TNNI3 55665463 Exon 6 c.484C>T p.R162W 1/4300 Yes Yes Damaging Probably damaging
ACTC1 35085599 Exon 3 c.301G>A p.E101K No Yes No Damaging Probably damaging

Previously non-reported at Ensembl (www.ensembl.org, Release 74). HCM, hypertrophic cardiomyopathy; HGMD, human gene mutation database.

The 2 INDEL control variants in the discovery pools were successfully detected. In addition, a total of 13 putative INDELS with possible functional effect were identified. After applying the analysis parameters we concluded that none of the 13 putative INDELS fulfilled the quality criteria to be considered true: they mapped in only 1 strand, were present at a high frequency in the 2 arrays, and none of them was reported in the gene variation databases. Moreover, to validate the procedure we Sanger sequenced the patients in the corresponding pools and confirmed that all were false positives.

Because the 4 discovery pools were composed of 20–25 patients while the validation pools contained 12–16 different DNA samples, we Sanger sequenced the 9 genes in all the patients from the largest discovery pool to confirm that there were no false negatives when a larger number of patients was included in a pool. Pool A contained (in addition to 2 control samples) DNA from 23 patients who had been partly sequenced for MYH7, MYBPC3, TNNT2, TPM1, and TNNI3. After completing the Sanger sequencing of the 9 genes in these cases, we confirmed that no SNV had been missed by NGS and that the only false positives were the 2 MYH7 changes (Table S16). With regard to the INDELS, we also confirmed the absence of false negatives, and the only false positive identified was TNNT2 c.136-49_136-48insA, previously identified in other pools (Table S17).

Discussion

The Ion Torrent PGM is a semiconductor (instead of optical) sequencer.15–17 The reported PGM procedures are based on PCR amplification of DNA from single patients followed by pooling and barcode labeling of each patient’s fragments and NGS.7,18,19 We developed a custom Ampliseq multiplex amplification procedure designed to amplify the main HCM genes in only 2 tubes. This procedure avoids the need for multiple amplifications per patient, but at the cost of poor or no amplification for some of the exons. Only 4 out of the 176 amplicons failed to amplify and give sequence reads. Although the corresponding exons should be Sanger sequenced from each patient as part of the mutation screening, we consider that this represents a minimal cost compared to the advantage of amplifying 98% of the amplicons in only 2 tubes (Figure 1). It could be argued that the multiplex amplification might be optimized by redesigning the primer pairs for the non-readable amplicons. The maximum read length of the PGM (and other NGS procedures), however, is currently limited to approximately 200 bp, and this results in practical constraints in designing PCR primers around the targeted exons, especially when they are embedded in GC-rich regions. This strategy of multiplex amplification followed by PGM sequencing has already been tested in an HCM cohort.20 In that study, however, all the patients had previously been Sanger sequenced, the covered genes were not the same as in the present study, and the authors amplified each patient’s DNA individually.

In addition to amplifying all the target exons in only 2 tubes, we also sequenced DNA pools. This approach avoids the cost of processing single individuals and would thus reduce the cost of screening large numbers of individuals. The amplification of single fragments from DNA pools has been used to re-sequence and discover rare variants linked to common diseases, as well as to uncover mutations in mendelian disorders.21–23 Although DNA pools have been successfully sequenced with other NGS platforms, some authors concluded that barcoding of individual samples before pooling (rather than a genomic DNA pooling strategy) is preferred to avoid false negatives.10 If a fragment was not amplified from a particular DNA in the pool, the patient would be wrongly classified as a non-mutation carrier. We think this was unlikely in the present study because we excluded the occurrence of false negatives in 60 patients who were both Sanger and NGS sequenced for the 9 genes. Also, all the pools used high-quality DNA previously assayed and quantified through Taqman assays. In addition, the power to detect rare mutations could be increased by sequencing overlapping population pools in which each individual occurs in 2 pools.24 One of the main limitations of the PGM (and other NGS platforms) is the presence of false positives, mainly in homopolymer regions >4 nucleotides. With the present VC criteria the specificity for INDELS was 80%, with a sensitivity of 100% in detecting all the true variants inside the pools.

Once the procedure was validated, we searched for mutations in a total of 76 patients distributed in 4 pools. In addition to new cases, we also included patients who had been partially sequenced or studied through indirect techniques (such as single-strand conformation analysis). We successfully identified the control variants in these discovery pools, and also found a total of 19 rare nucleotide changes that were classified as probably damaging or variants of uncertain effect. All of them were confirmed and assigned to 17 patients via Sanger sequencing of the corresponding exon. Two patients were double mutation carriers, a condition that has been linked to poor prognosis.25 In a total of 11 of these index cases we performed family studies and identified a total of 26 mutation carriers (data not shown).

The validation pools contained fewer samples than the discovery pools, and it might be argued that sensitivity and specificity could be lower when larger pools undergo NGS. We think this was unlikely in the present study, because the 9 genes were Sanger sequenced in all the 25 patients from the largest pool and we did not find false-negative variants, and the only false positives were the same as those in the validation pools. Although we were able to identify unique nucleotide variants in a pool of 25 individuals (1/50 alleles), the lower limit of detection was not determined. Thus, it is possible that rare unique variants could be read over the sequencing noise level in even larger pools.

An issue to consider when searching for rare variants in DNA pools is the number of Sanger sequences required to characterize the mutation carriers in the pool. Among the 4 discovery pools, the largest number of mutations (n=6) was found in pool D, containing 22 patients. Thus, a total of 132 PCR fragments were amplified and 264 Sanger reads (forward+reverse strands) generated. In comparison, a total of 128 amplicons and 5,632 sequence reads would be required for the Sanger sequencing of the 9 genes in the 22 patients. In the case that the number of putative mutations equals the number of samples (N) in the pool and all the mutations are in different amplicons, a total of N² single amplicons would need to be Sanger sequenced. Identifying all the putative mutations would thus be impractical in cost and labor time for large DNA pools. In this study we created 8 libraries to sequence a total of 136 HCM patients for approximately €2,400 in library preparation. If we had performed this study using individually barcoded samples, the cost of library preparation would have risen to €40,800, 17-fold greater. In addition, 2 technicians carried out the NGS and Sanger assignment of the putative mutations in the 76 patients from the discovery pools in only 4 weeks, a time much shorter than that required to Sanger sequence the 9 genes in all the patients.
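
The workload and cost figures in this paragraph follow from simple products, which can be checked with a short sketch. It assumes that each putative mutation lies in a different amplicon, that every amplicon is read on both strands, and that each library costs the same; the function names are illustrative.

```python
def deconvolution_workload(n_putative_mutations: int, pool_size: int):
    """PCR fragments and Sanger reads needed to assign every putative
    mutation to a carrier (one amplicon per mutation, both strands)."""
    fragments = n_putative_mutations * pool_size
    return fragments, 2 * fragments


def full_sanger_reads(n_amplicons: int, n_patients: int) -> int:
    """Sanger reads needed to sequence every amplicon in every patient."""
    return n_amplicons * n_patients * 2


# Pool D: 6 putative mutations among 22 patients.
print(deconvolution_workload(6, 22))
# Full Sanger sequencing of the 9 genes (128 amplicons) in the same 22 patients.
print(full_sanger_reads(128, 22))

# Library preparation: 8 pooled libraries for about 2,400 euros in total.
cost_per_library = 2400 / 8
individual_cost = 136 * cost_per_library
print(individual_cost, individual_cost / 2400)
```

The first call also illustrates the N² case: with N putative mutations in a pool of N samples, `deconvolution_workload(N, N)` returns N × N fragments.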

Finally, NGS technology has generated high-throughput sequencing data on genes linked to cardiomyopathies, in patients and in apparently healthy individuals.6,9 The Exome Sequencing Project found rare variants (ESP database, http://evs.gs.washington.edu) previously identified in HCM patients, a fact that has called into question the pathogenicity of some missense nucleotide changes previously considered as mutations.26,27 At this stage, the present procedure would facilitate rapid and cost-effective sequencing of the main HCM-associated genes in large sets of individuals.

Conclusions

We report the massively parallel sequencing of the genes most commonly mutated in HCM patients. The present procedure would facilitate the rapid and cost-effective screening of these genes on a population scale.

Acknowledgments

This work was supported by a grant from Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional (FIS-12/00287; RD12/0021/0012).

Disclosures

The authors declare no conflict of interest.

Author Contributions: J.R.R., C.M., and M.M. recruited the patients and performed the clinical and echographic studies; J.G., V.A., B.A., S.I., and E.C. performed the genetic studies; J.G. and E.C. wrote the manuscript. All the authors have seen and approved the final version of the manuscript.

Supplementary Files

Supplementary File 1

Table S1. Samples used to create the 8 pools

Table S2. Ampliseq amplicons and coverage details

Table S3. Ampliseq for the 9 genes

Table S4. Ion Torrent PGM run characteristics (n=3)

Table S5. GC content in 4 unreadable amplicons

Table S6. Rare SNV in Sanger to NGS validation pool 1 (n=13)

Table S7. INDELS in Sanger to NGS validation pool 1

Table S8. Rare SNV in NGS to Sanger validation pool 2

Table S9. Rare SNV in NGS to Sanger validation pool 3

Table S10. Rare SNV in NGS to Sanger validation pool 4

Table S11. INDEL variants in NGS to Sanger validation pool 2

Table S12. INDEL variants in NGS to Sanger validation pool 3

Table S13. INDEL variants in NGS to Sanger validation pool 4

Table S14. Control variants in discovery pools

Table S15. Rare variants in discovery pools A–D

Table S16. Rare SNV in discovery pool A

Table S17. INDEL variants in discovery pool A

Figure S1. Examples of the Integrative Genomics Viewer for false-positive single-nucleotide variants and INDELS, with the corresponding Sanger sequence.

Please find supplementary file(s);

http://dx.doi.org/10.1253/circj.CJ-14-0628

References
 
© 2014 THE JAPANESE CIRCULATION SOCIETY