Mutation Analysis of the Main Hypertrophic Cardiomyopathy Genes Using Multiplex Amplification and Semiconductor Next-Generation Sequencing

Juan Gómez; Julian R. Reguero; César Morís; María Martín; Victoria Alvarez; Belén Alonso; Sara Iglesias; Eliecer Coto

doi:10.1253/circj.CJ-14-0628

Abstract

Background: Mutations in at least 30 genes have been linked to hypertrophic cardiomyopathy (HCM). Due to the large size of the main HCM genes, Sanger sequencing is labor intensive and expensive. The purpose was to develop a next-generation sequencing (NGS) procedure for the main HCM genes.

Methods and Results: Multiplex amplification of the coding exons of MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1 was designed, followed by NGS with the Ion Torrent PGM (Life Technologies). A total of 8 pools containing DNA from HCM patients were sequenced in a 2-step approach. First, a total of 60 patients (validation cohort) underwent both PGM and Sanger sequencing for the 9 genes. No false-negative variants were found on NGS (100% sensitivity), and a specificity of 97% and 80% was achieved for single-nucleotide and insertion/deletion variants, respectively. Second, the PGM was used to search for mutations in a total of 76 cases not previously studied (discovery cohort). A total of 19 putative mutations were identified in the discovery pools, which were confirmed and assigned to specific patients on Sanger sequencing.

Conclusions: An NGS procedure has been developed for the main sarcomeric genes that would facilitate the screening of large cohorts of patients. In addition, this procedure would facilitate the uncovering of rare gene variants on a population scale.

Mutations in at least 30 different genes have been found in patients with hypertrophic cardiomyopathy (HCM), with MYBPC3 and MYH7 accounting for approximately 50% of the mutations.¹–⁴ Due to the large size of these genes, the Sanger sequencing of single amplicons is labor intensive and expensive. Next-generation sequencing (NGS) technologies could facilitate the genetic screening of the HCM genes in large cohorts of patients.⁵,⁶ Most of the reported NGS procedures are based on the polymerase chain reaction (PCR) amplification of the coding exons from each patient with primers that match the flanking introns, followed by the pooling and digestion of the PCR products to achieve a readable size (commonly, <200 bp) and the ligation of a specific oligonucleotide (barcode) to each fragment.⁷–⁹ Because each patient can be recognized via the barcode, it is possible to sequence many different patients in a single array. In practice, this means that NGS of a large number of patients would require many different PCR and barcoding assays. One way to reduce the experimental time and cost would be to perform multiplex amplification (all the target sequences in a few tubes) of DNA pools. The putative mutations found in a pool could then be assigned to a specific individual by Sanger sequencing of the corresponding exon in all the individuals used to create the pool (Figure 1). Despite the labor and cost savings of this approach, a main limitation of the NGS of DNA pools is that rare nucleotide variants could be diluted by the wild-type allele to a level too low to be detected (false negatives).¹⁰ Other authors, however, considered this a valid approach to search for mutations in mendelian disorders.¹¹

Figure 1.

Flow-chart for the next-generation sequencing of DNA pools to characterize mutations in the main hypertrophic cardiomyopathy (HCM) genes.

The purpose of this study was to develop and validate a procedure for sequencing the most commonly mutated genes in HCM, based on 2-tube multiplex amplification of DNA pools and NGS with the Ion Torrent semi-conductor (non-optical) Personal Genome Machine (PGM; Life Technologies). This approach would facilitate the rapid and cost-effective search for rare DNA variants in large numbers of individuals.

Methods

Patient Characteristics

This research, including the informed consent forms and procedures, was approved by the Ethics Committee of Hospital Universitario Central Asturias (HUCA). All the patients were Caucasian and from the region of Asturias (Northern Spain) and gave their written informed consent to participate in the study, which was recorded in the patient’s clinical history.

HUCA is a reference center for genetic studies of HCM in Spain. The study involved a total of 136 unrelated HCM index cases, recruited through the Cardiology Department of HUCA in the period 2001–2013. HCM was diagnosed based on clinical symptoms and left ventricular septum (LVS) >15 mm in the absence of any other condition that could explain the hypertrophy (such as hypertension). Patients with at least 1 relative who had also been diagnosed with HCM were defined as familial cases.

Patients were divided into 3 groups (Table 1): (1) validation Sanger to NGS (n=26), previously Sanger sequenced for the coding exons (plus at least 5 intronic flanking nucleotides) of MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1;¹⁰–¹² (2) validation NGS to Sanger (n=34), first sequenced on NGS and then via Sanger for the 9 genes; or (3) discovery (n=76), patients not previously sequenced or only partly sequenced, who underwent NGS followed by Sanger sequencing of the exons containing putative mutations.

Table 1. HCM Patient Characteristics (n=136)

Characteristics	Mean±SD or n (%)
Mean age at diagnosis (years)	48±13
Range	19–76
Male	81 (60)
HCM history	58 (43)
Mean BMI
Male	26±3
Female	24±4
Mean IVS	21±5
Mean PWT	12±5
Mean LVWT	32±6
Dyspnea	90 (66)
NYHA index
Class I–II	63 (46)
Class III–IV	27 (20)
Angina	47 (35)
Syncope	37 (27)
Atrial fibrillation	31 (23)
Arrhythmia (Holter monitoring)	45 (33)
LVOTO >30 mmHg	49 (36)

BMI, body mass index; HCM, hypertrophic cardiomyopathy; IVS, interventricular septum; LVOTO, left ventricular outflow tract obstruction; LVWT, left ventricular wall thickness; NYHA, New York Heart Association; PWT, posterior wall thickness.

DNA Template Preparation

DNA was obtained following a salting-out method, resuspended in water, and adjusted to a final concentration of 10 ng/µl using Real Time Taqman quantification with RNase P Detection Reagents (FAM™; Life Technologies) in a 7500 Real Time PCR-System (Applied Biosystems). Using this procedure, we also confirmed that all the DNA samples were suitable for amplification.

Eight DNA pools containing 10 µl of the corresponding DNA were produced (Table S1). Pool 1 consisted of 13 patients from the Sanger to NGS group, each with 1 unique nucleotide variant, either a mutation or a polymorphism (control variants), that would thus be present at an allele frequency of 1/26 in the pool. Three pools (2–4) with 12–16 samples per pool consisted of 4 patients from the Sanger to NGS group who harbored a mutation (control variants) plus 34 patients from the NGS to Sanger group. Finally, 4 pools (A–D) with 20–25 samples per pool consisted of 12 patients from the Sanger to NGS group, plus 76 patients from the discovery group.
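The expected frequency of a control variant follows directly from the pool size. The following is a minimal sketch, assuming that every patient is diploid, contributes the same amount of DNA to the pool, and carries the variant in the heterozygous state; the function name is illustrative and is not part of the original analysis.

```python
from fractions import Fraction


def expected_allele_fraction(n_patients: int, carriers: int = 1) -> Fraction:
    """Expected fraction of variant alleles in an equimolar pool of diploid DNA.

    Assumption: each carrier is heterozygous (one variant allele per carrier).
    """
    if n_patients <= 0 or not 0 <= carriers <= n_patients:
        raise ValueError("invalid pool composition")
    return Fraction(carriers, 2 * n_patients)


# Pool 1: 13 patients, each control variant carried by a single patient.
pool1 = expected_allele_fraction(13)
print(pool1, f"{float(pool1):.2%}")  # 1/26 3.85%
```

Under the same assumptions, a unique variant in a pool of 25 patients would be expected at 1/50 (2%).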

Multiplex (Ampliseq) Amplification

A 2-tube multiplex amplification for the coding sequence exons plus at least 5 intronic flanking nucleotides (approximately 16 kb) of the MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1 genes was designed online (Ion AmpliSeq™ Designer; https://www.ampliseq.com). We compared several primer design options and ordered the one that gave the maximum target sequence coverage. Primer pairs to amplify a total of 176 fragments were provided by the manufacturer in only 2 tubes. The amplicons covered 99% of the target sequence (Tables S2,S3).

Each DNA pool was amplified with the Ion Ampliseq™ Library Kit in conjunction with Ion Ampliseq™ Custom Primer Pool protocols according to the manufacturer's procedures (Life Technologies), with the following steps: PCR in 2 tubes, partial digestion of the primers with FuPa Reagent, ligation of the barcode adapters (only for the discovery cohort pools), purification with Agencourt® AMPure® XP Reagent, PCR with the adapters using Platinum® PCR SuperMix High Fidelity enzyme (Invitrogen), purification with Agencourt® AMPure® XP Reagent, quantification of the sample (Agilent Bioanalyzer Instrument and Qubit® 2.0 Fluorometer), and dilution of the sample to a final concentration of 20 pmol/L.

Template preparation, emulsion PCR, emulsion breaking, and enrichment were performed using the Ion PGM™ Template OT2 200 kit following the manufacturer's instructions (Life Technologies). Briefly, a total of 10 ng of the DNA pool was amplified in 2 Ampliseq tubes using the Ion Ampliseq™ Library Kit. The reactions were quantified (Agilent Bioanalyzer) and then emulsion PCR was done using the Ion PGM Template OT2 200 Kit and the Ion One-Touch instrument (Life Technologies). Template-positive spheres were recovered using Dynabeads MyOne Streptavidin C1 beads and quantified using the Ion Sphere™ Quality Control Assay and the Qubit 2.0 fluorometer (Life Technologies).

We performed 3 massively parallel sequencing experiments: pool 1 was sequenced with an Ion Torrent 316 (100-Mb) array, and the NGS to Sanger (2–4) and discovery (A–D) pools were individually barcoded and sequenced in two 318 (1,000-Mb) arrays.

NGS was performed using the PGM 200 sequencing kit protocol in the Ion Torrent PGM. We used 260-flow runs, which support a template read length of approximately 200 bp. The number of samples used to create the pools was decided taking into account the load capacity of the array, the total length of the target sequences (approximately 16 kb), the dilution of a unique rare allele inside the pool, and the number of reads per amplicon necessary to achieve a theoretical minimum 50× coverage.
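The pool-size reasoning above can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the procedure actually used: it reads "50× coverage" as 50 reads for each of the 2N alleles in a pool of N diploid patients, and it assumes that all amplicons amplify uniformly. The function name and its parameters are hypothetical.

```python
def required_reads(n_patients: int, n_amplicons: int,
                   per_allele_coverage: int = 50) -> int:
    """Total reads needed so that, on average, each of the 2N pooled alleles
    is read `per_allele_coverage` times in every amplicon.

    Assumptions: diploid patients, equimolar pooling, uniform amplification.
    """
    if min(n_patients, n_amplicons, per_allele_coverage) <= 0:
        raise ValueError("all arguments must be positive")
    alleles = 2 * n_patients
    return alleles * per_allele_coverage * n_amplicons


# 176 amplicons, for a 13-patient pool and a 25-patient pool.
for n in (13, 25):
    print(n, required_reads(n, n_amplicons=176))
# 13 228800
# 25 440000
```

In practice amplification is not uniform, so the real read requirement would be higher than this lower bound.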

NGS Data Analysis

The raw PGM data were processed with Torrent Suite v3.4.2 (Life Technologies) to generate sequence reads filtered by the pipeline software quality controls. Read assembly and variant identification were done with Variant Caller (VC) v3.4.51874, using FastQ files containing the sequence reads and the BED file from the Ion Ampliseq Designer software to map the amplicons. The Integrative Genomics Viewer (IGV, Broad Institute) was used for the analysis of depth coverage, sequence quality, and variant identification. Variants were identified with the somatic sample VC default algorithm. We considered 3 coverage classes: <20× coverage per allele, amplicons were discarded and the corresponding exons were Sanger sequenced in each individual; 20–50× coverage per allele, amplicons were considered admissible and the BAM files were visualized to confirm the read quality and the nucleotide variants; and >50× coverage per allele, amplicons were considered optimal.
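The three coverage classes can be expressed as a simple decision rule. The sketch below only restates the thresholds given above; the function name and the returned labels are illustrative, and the example amplicon names and coverage values are invented.

```python
def classify_amplicon(per_allele_coverage: float) -> str:
    """Map per-allele coverage to the action taken for an amplicon."""
    if per_allele_coverage < 20:
        return "discard"      # exon is Sanger sequenced in each individual
    if per_allele_coverage <= 50:
        return "admissible"   # BAM file inspected to confirm reads and variants
    return "optimal"


# Invented coverage values for three hypothetical amplicons.
example = {"amplicon_1": 8, "amplicon_2": 35, "amplicon_3": 120}
for name, coverage in example.items():
    print(name, classify_amplicon(coverage))
```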

Because insertions/deletions (INDELS) are frequently missed by the PGM (and other NGS platforms), we performed a specific analysis to reduce the risk of non-detection of true INDELS (false negatives). The somatic sample VC default algorithm was set to low sample coverage, minimum allele frequency, and minimum variant frequencies of 10,000, 0.01, and 0.01, respectively. In addition, the BAM files of amplicons containing putative INDELS were visualized and those that mapped to only 1 strand were discarded.
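The strand criterion was applied by visual inspection of the BAM files. As a hedged illustration of the same rule in code, the sketch below keeps a putative INDEL only when variant reads are seen on both strands. The `IndelCall` structure, the call names, and the read counts are all hypothetical; they are not output from Variant Caller.

```python
from dataclasses import dataclass


@dataclass
class IndelCall:
    name: str
    forward_reads: int  # variant-supporting reads on the forward strand
    reverse_reads: int  # variant-supporting reads on the reverse strand


def supported_on_both_strands(call: IndelCall) -> bool:
    """True when the variant is seen on both strands (single-strand calls are discarded)."""
    return call.forward_reads > 0 and call.reverse_reads > 0


# Invented calls, for illustration only.
calls = [
    IndelCall("indel_A", forward_reads=60, reverse_reads=55),
    IndelCall("indel_B", forward_reads=90, reverse_reads=0),
]
kept = [c.name for c in calls if supported_on_both_strands(c)]
print(kept)  # ['indel_A']
```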

Sanger Sequencing and Putative Mutation Assignment

Nucleotide variants that fulfilled any of the following criteria were considered putative mutations: had a functional effect (missense, nonsense or frameshifting amino acid changes; pre-mRNA splicing); were reported in the Human Gene Mutation Database; or were classified as likely pathogenic by in silico prediction (Polyphen and SIFT). For each putative mutation in the 7 discovery pools the corresponding DNA samples were individually amplified and sequenced with BigDye chemistry using ABI3130 equipment (Life Technologies) to identify the mutation carrier. Briefly, the exon containing the nucleotide variant was amplified with primers that matched the flanking introns, and the PCR fragments were purified and sequenced.¹³,¹⁴

Results

Validation of Ampliseq HCM Genes

A DNA pool-based strategy was carried out, using custom target multiplex amplification for the main HCM genes (MYH7, MYBPC3, TNNT2, TNNI3, ACTC1, TNNC1, MYL2, MYL3, and TPM1) in only 2 tubes, followed by massively parallel sequencing in the Ion Torrent PGM sequencer. As a first validation step, a pool of DNA from 13 patients (pool 1) was amplified and PGM sequenced in the medium-capacity (100-Mb) 316 semiconductor chip. By processing a single pool we avoided the cost and labor time of individual amplification, barcoding, and library preparation of the 13 DNA samples.

The array load density was 73% with 344 Mb of nucleotide reads, a mean read length of 138 bp, and 98% of the amplicons having >100× coverage (Table S4). A total of 172 of the 176 amplicons (98%) had optimal reads, and only 4 (2%) had null or poor reads (Figure 2). Among the non-readable amplicons, 2 corresponded to exon 1 of TPM1, 1 to exon 12 of MYBPC3, and 1 to exon 6 of ACTC1. Because these failures were replicated in the 318 arrays and the sequences of the primer pairs were correct, we concluded that the absence of nucleotide reads for the 4 amplicons was likely due to some characteristic that made them refractory to amplification, such as a high GC content (3 of the 4 amplicons had a GC content >60%; Table S5).

Figure 2.

Nucleotide reads for the amplicons of the 9 genes.

The VC identified a total of 45 single-nucleotide variants (SNV) in the validation pool 1 (Table S6). All the control variants in readable amplicons were detected at a threshold frequency >1% (allele frequency range, 3.35–5.42%; Table 2). The VC also identified all the variants in readable amplicons previously found through Sanger sequencing of the 13 patients. We identified 2 MYH7 variants that were not found on Sanger sequencing: c.T136>C and c.T5345>A. These nucleotide changes were thus false positives. On visual inspection of the BAM files using IGV, c.T136>C overlapped 2 amplicons, and most of the c.5345A reads were in forward strand sequences (Figure S1).

Table 2. Rare Variants in Sanger to NGS Validation Pool (n=13)

Gene	Nucleotide position	Exon/Intron	cDNA	Effect	Rare variant (%)
MYH7	23898994	Exon 12	c.1128C>T	p.D376D	4.02
MYH7	23900093	Intron 10	c.895+17G>A	None	4.43
MYBPC3	47358997	Exon 24	c.2547C>T	p.V849V	4.01
MYBPC3	47360053	Intron 22	c.2308+18C>G	None	3.54
MYBPC3	47365049	Exon 13	c.1217G>A	p.S406N	3.76
MYBPC3	47370076	Exon 6	c.671_673delTGC	p.L224fs	3.35
MYBPC3	47371598	Exon 4	c.472G>A	p.V158M	3.85
TNNT2	201330429	Exon 14	c.758A>G	p.K260R	4.49
TNNI3	55667958	Intron 3	c.150+13G>A	None	4.88
TNNI3	55668992	Exon 1	c.–35C>A	None	5.42
MYL2	111353556	Exon 3	c.132T>C	p.I44I	3.78
MYL3	46902491	Intron 1	c.130–14G>T	None	3.89
TPM1	63353098	Exon 5	c.523G>A	p.D175N	5.13

NGS, next-generation sequencing.

The VC identified the only INDEL control variant (MYBPC3 c.671_673delTGC; Table 2). In addition, the VC identified 10 INDELS, but only c.53-11_53-7delCTTCTT in TNNT2 was found on Sanger sequencing (Table S7). On inspection using IGV, the other 9 were false positives that mapped to only 1 strand or to homopolymer regions longer than 5 nucleotides.

Sensitivity and Specificity

To extend the validation, pools 2–4 containing a total of 37 HCM patients underwent NGS followed by Sanger sequencing of the 9 genes. The main data of the Ion PGM runs are summarized in Table S4. In the 3 pools, we replicated the 4 sequencing failures (<20× coverage) previously observed in validation pool 1. The VC identified a total of 52, 56, and 49 SNV in the 3 pools, respectively (Tables S8–S10). We excluded the occurrence of false negative SNV on Sanger sequencing of the 9 genes in all patients used to create the pools. Moreover, we identified a total of 49 variants that were either mutations or polymorphisms seen in only 1 sample inside the corresponding pool (Table 3). Together, these data confirmed the accuracy of the method in avoiding false negatives. The VC also identified the 2 false-positive SNV previously found in validation pool 1. With regard to the INDELS, no false negatives and 3 false positives were found (Tables S11–S13).

Table 3. Rare Variants in NGS to Sanger Validation Pools (n=37)

Gene	Position	Exon/Intron	cDNA	Effect	Frequency	DNA pool
MYH7	23886055	Intron 33	c.4644+22G>A	None	4.5	2
MYH7	23886064	Intron 33	c.4644+12_4644+13delTG	None	4.8	1
MYH7	23886064	Intron 33	c.4644+12_4644+13delTG	None	4.2	2
MYH7	23886155	Exon 33	c.4544T>C	p.T1522T	2.6	2
MYH7	23886264	Intron 32	c.4520–63G>A	None	3.4	1
MYH7	23886504	Exon 32	c.4377G>T	p.K1459N	2.8	1
MYH7	23888371	Intron 29	c.3972+15C>T	None	3.2	3
MYH7	23892799	Exon 24	c.3062C>A	p.T1019N	4.0	2
MYH7	23892950	Intron 23	c.2923–18G>A	None	4.1	3
MYH7	23893034	Intron 23	c.2922+82C>T	None	4.2	3
MYH7	23893995	Exon 22	c.2662C>A	p.Q888K	3.3	3
MYH7	23897077	Exon 16	c.1605A>G	p.E535E	4.1	3
MYH7	23898994	Exon 12	c.1128C>T	p.D376D	3.9	1
MYH7	23898994	Exon 12	c.1128C>T	p.D376D	4.3	2
MYH7	23898994	Exon 12	c.1128C>T	p.D376D	4.7	3
MYH7	23899027	Exon 12	c.1095G>A	p.K365K	2.9	1
MYH7	23899038	Exon 12	c.1084G>A	p.M362V	2.8	1
MYH7	23899793	Exon 11	c.975C>T	p.D325D	4.5	2
MYH7	23900093	Intron 10	c.895+17G>A	None	4.5	1
MYH7	23901012	Exon 7	c.597A>G	p.A199A	4.6	1
MYH7	23901012	Exon 7	c.597A>G	p.A199A	3.6	2
MYH7	23901922	Exon 5	c.428G>A	p.R143Q^†	4.0	2
MYBPC3	47356615	Exon 26	c.2883G>A	p.P961P	4.5	3
MYBPC3	47357416	Intron 25	c.2737+12C>T	None	5.7	2
MYBPC3	47358997	Exon 24	c.2547C>T	p.V849V	3.3	2
MYBPC3	47359014	Exon 24	c.2531_2532 insGA	p.M844fs^†	5.5	1
MYBPC3	47359014	Exon 24	c.2531_2532 insGA	p.M844fs^†	5.3	3
MYBPC3	47360053	Intron 22	c.2308+18C>G	None	3.4	1
MYBPC3	47360053	Intron 22	c.2308+18C>G	None	4.9	2
MYBPC3	47360133	Exon 22	c.2246G>A	p.Y749C	5.5	1
MYBPC3	47364138	Exon 18	c.1615A>G	p.I539V	3	1
MYBPC3	47362642	Intron 18	c.1847+47G>A	None	2.8	2
MYBPC3	47362642	Intron 18	c.1847+47G>A	None	5.2	3
MYBPC3	47364129	Exon 16	c.1624G>C	p.E542Q	4.4	3
MYBPC3	47364248	Exon 16	c.1505G>A	p.R502Q	2.6	3
MYBPC3	47364975	Intron 13	c.1223+68C>T	None	2.8	3
MYBPC3	47367823	Exon 12	c.1025T>A	p.V342D	2.6	1
MYBPC3	47369443	Exon 7	c.786C>T	p.T262T	3.6	3
MYBPC3	47370037	Exon 6	c.710A>G	p.Y237C^†	3.6	1
MYBPC3	47370041	Exon 6	c.706A>G	p.S236G	3.3	3
MYBPC3	47370074	Exon 6	c.671_673delTGC	p.L224fs^†	4.4	2
MYBPC3	47370074	Exon 6	c.671_673delTGC	p.L224fs^†	4.6	3
MYBPC3	47371598	Exon 4	c.472G>A	p.V158M	5.7	1
MYBPC3	47371598	Exon 4	c.472G>A	p.V158M	4.2	3
TNNT2	201328272	Exon 16	c.*66G>A	None	3	2
TNNT2	201330429	Exon 14	c.758A>G	p.K260R	4.5	2
TNNI3	55665410	Exon 6	c.537G>A	p.E179E	4.8	3
TNNI3	55668397	Intron 2	c.180+21G>A	None	4	3
MYL2	111351974	Intron 4	c.274+16_274+17insCT	None	2.2	3
MYL2	111353556	Exon 3	c.132T>C	p.I44I	3.1	2
MYL3	46901019	Exon 4	c.427G>A	p.E143K	3.7	1
MYL3	46902491	Intron 1	c.130–14G>T	None	3.7	1
TPM1	63335074	Exon 1	c.46G>C	p.E16Q	3.4	3
TPM1	63353451	Exon 6	c.689+313A>G	p.A216A	2.7	1
TPM1	63356237	Intron 8	c.898+1393C>T	None	5.5	1
TPM1	63358033	Intron 9	c.898+3189delT	None	5.2	1
TPM1	63356331	Exon 9	c.841A>G	p.M281V	3	2
ACTC1	35083251	Intron 6	c.990+64C>T	None	3.8	2

^†Known control variants. NGS, next-generation sequencing.

A total of 60 patients (pools 1–4) underwent both NGS and Sanger sequencing, with 100% sensitivity of the NGS (no false-negative variants). With regard to the specificity, there were only 2 false-positive SNV (MYH7 c.T136>C and c.T5345>A). As expected, the number of false positives was higher for the INDELS: a total of 20 fulfilled the criteria for being deconvoluted and only 4 were false positives (80% specificity).
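The 80% figure for INDELS can be reproduced from the counts given above. The sketch assumes that the reported specificity was computed as the share of called variants that were confirmed on Sanger sequencing (a quantity often termed positive predictive value); this reading is an assumption, and the function name is illustrative.

```python
def confirmed_fraction(total_calls: int, false_positives: int) -> float:
    """Share of NGS calls that were confirmed on Sanger sequencing."""
    if total_calls <= 0 or not 0 <= false_positives <= total_calls:
        raise ValueError("invalid counts")
    return (total_calls - false_positives) / total_calls


# INDELS in pools 1-4: 20 calls, of which 4 were false positives.
print(f"{confirmed_fraction(20, 4):.0%}")  # 80%
```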

PGM of the Discovery Pools

After determining the accuracy to detect rare variants in the validation pools, we sequenced a total of 86 patients in 4 discovery pools (A–D; 20–25 samples per pool). In addition to patients who had not been Sanger sequenced or had been only partly sequenced (and were negative for mutations; n=76), each pool also contained the DNA from 2–4 patients with known mutations. The main data of the Ion PGM runs are summarized in Table S4. We replicated the 4 NGS failures (<20× coverage) previously observed in the validation pools. Thus, exon 1 of TPM1, exon 12 of MYBPC3, and exon 6 of ACTC1 should be Sanger sequenced in all patients as part of mutation screening.

At a 1% threshold the VC identified all the known control mutations in the 4 pools (Table S14; an Excel file with all the variants is available upon request from the corresponding author). We also confirmed and assigned to specific patients, through Sanger sequencing of the corresponding exon, a total of 19 rare nucleotide changes that were classified as probably damaging or variants of uncertain effect (Figure 3; Table 4; Table S15). We found at least 1 putative mutation in 17 of the 76 patients. Two patients were carriers of 2 different variants (MYBPC3 p.A216T+p.E258K, and MYH7 p.L620P+p.K1459N). We also excluded the presence of additional patients with any of the control variants by sequencing the corresponding exon from all the patients in each pool. The VC also reported the 2 false MYH7 SNV in the 4 discovery pools.

Figure 3.

Integrative Genomics Viewer and Sanger electropherogram of variants found in the discovery pools, including a G insertion (p.V931fs) in MYBPC3, and nucleotide changes (p.E143Q and p.Y749C) in MYL3 and MYBPC3.

Table 4. Rare Variants, Putative Mutations, in the Discovery Pools

Gene	Nucleotide position	Exon/Intron	cDNA	Effect	ESP frequency	HGMD	HCM history	SIFT	Polyphen
MYH7	23886504	Exon 32	c.4377G>T	p.K1459N	1/4300	Yes	No	Damaging	Probably damaging
MYH7	23895023	Exon 20	c.2167C>G	p.R723C	No	Yes	No	Damaging	Probably damaging
MYH7	23896042	Exon 18	c.1988G>A	p.R663H	1/4300	Yes	Yes	Damaging	Possibly damaging
MYH7	23896823	Exon 16	c.1859T>C	p.L620P^†	No	No	No	Damaging	Probably damaging
MYH7	23902931	Exon 3	c.11C>T	p.S4L	No	Yes	No	Tolerated	Benign
MYBPC3	47355475	Exon 27	c.2992C>G	p.Q998E	No	Yes	Yes	Damaging	Probably damaging
MYBPC3	47359046	Exon 24	c.2498C>T	p.A833V	2/4265	Yes	No	Tolerated	Probably damaging
MYBPC3	47360133	Exon 22	c.2246G>A	p.Y749C^†	No	No	No	Damaging	Probably damaging
MYBPC3	47367816	Exon 12	c.1032C>A	p.D344E^†	No	No	No	Damaging	Possibly damaging
MYBPC3	47369442	Exon 7	c.787G>T	p.G263X^†	No	No	Yes	–	–
MYBPC3	47369975	Exon 6	c.772G>A	p.E258K	No	Yes	Yes	Damaging	Possibly damaging
MYBPC3	47371333	Exon 5	c.646G>A	p.A216T	1/4204	Yes	Yes	Tolerated	Benign
MYBPC3	47371414	Exon 5	c.565G>A	p.V189I	29/4208	No	No	Tolerated	Benign
MYBPC3	47371619	Exon 4	c.451G>A	p.D151N^†	No	No	No	Tolerated	Benign
MYBPC3	47373032	Exon 2	c.50G>A	p.R17Q	1/4204	No	No	Tolerated	Possibly damaging
TNNT2	201328348	Exon 16	c.848G>A	p.R283H	1/4299	Yes	Yes	Damaging	Probably damaging
TNNT2	201328373	Exon 16	c.823C>T	p.R275C	5/4299	Yes	No	Damaging	Probably damaging
TNNI3	55665463	Exon 6	c.484C>T	p.R162W	1/4300	Yes	Yes	Damaging	Probably damaging
ACTC1	35085599	Exon 3	c.301G>A	p.E101K	No	Yes	No	Damaging	Probably damaging

^†Previously non-reported at Ensembl (www.ensembl.org, Release 74). HCM, hypertrophic cardiomyopathy; HGMD, human gene mutation database.

The 2 INDEL control variants in the discovery pools were successfully detected. In addition, a total of 13 putative INDELS with possible functional effect were identified. After applying the analysis parameters we concluded that none of the 13 putative INDELS fulfilled the quality criteria to be considered true: they mapped in only 1 strand, were present at a high frequency in the 2 arrays, and none of them was reported in the gene variation databases. Moreover, to validate the procedure we Sanger sequenced the patients in the corresponding pools and confirmed that all were false positives.

Because the 4 discovery pools were composed of 20–25 patients while the validation pools contained 12–16 different DNA samples, we Sanger sequenced the 9 genes in all the patients from the largest discovery pool to confirm that there were no false negatives when a larger number of patients was included in a pool. Pool A contained (in addition to 2 control samples) DNA from 23 patients who had been partly sequenced for MYH7, MYBPC3, TNNT2, TPM1, and TNNI3. After completing the Sanger sequencing of the 9 genes in these cases, we confirmed the absence of SNV not identified in the NGS and found only the 2 false-positive MYH7 changes (Table S16). With regard to the INDELS, we also confirmed the absence of false negatives, and only the TNNT2 c.136-49_136-48insA false positive (previously identified in other pools) was found (Table S17).

Discussion

The Ion Torrent PGM is a semiconductor (instead of optical) sequencer.¹⁵–¹⁷ The reported PGM procedures are based on the PCR amplification of DNA from single patients followed by pooling and barcode labeling of each patient’s fragments and NGS sequencing.⁷,¹⁸,¹⁹ We developed a custom Ampliseq multiplex amplification procedure designed to amplify the main HCM genes in only 2 tubes. This procedure avoids the necessity of multiple amplifications per patient, but at the cost of poor or no amplification for some of the exons. Only 4 out of the 176 amplicons failed to amplify and give sequence reads. Although the corresponding exons should be Sanger sequenced from each patient as part of the mutation screening, we consider that this represents a minimal cost compared to the advantage of amplifying approximately 98% of the amplicons in only 2 tubes (Figure 1). It could be argued that the multiplex amplification might be optimized by redesigning the primer pairs for the non-readable amplicons. The maximum read capacity of the PGM (and other NGS procedures), however, is currently limited to approximately 200 bp, and this results in practical constraints in designing PCR primers around the targeted exons, especially when they are embedded in GC-rich regions. This strategy of multiplex amplification followed by PGM sequencing has already been tested in an HCM cohort.²⁰ Compared to the present study, however, all the patients were previously Sanger sequenced, the covered genes were not the same, and the authors amplified each patient’s DNA individually.

In addition to amplifying all the target exons in only 2 tubes, we also sequenced DNA pools. This approach avoids processing each individual separately and would therefore reduce the cost of screening large numbers of individuals. The amplification of single fragments from DNA pools has been used to re-sequence and discover rare variants linked to common diseases, as well as to uncover mutations in mendelian disorders.²¹–²³ Although DNA pools have been successfully sequenced with other NGS platforms, some authors concluded that barcoding of individual samples before pooling (rather than a genomic DNA pooling strategy) is preferred to avoid false negatives.¹⁰ If a fragment were not amplified from a particular DNA sample in the pool, the patient would be wrongly classified as a non-carrier. We think this was unlikely in the present study because we excluded the occurrence of false negatives in 60 patients who were both Sanger and NGS sequenced for the 9 genes. Also, all the pools used high-quality DNA previously assayed and quantified through Taqman assays. In addition, the power to detect rare mutations could be increased by sequencing overlapping population pools in which each individual occurs in 2 pools.²⁴ One of the main limitations of the PGM (and other NGS platforms) is the presence of false positives, mainly in homopolymer regions >4 nucleotides. With the present VC criteria the specificity for INDELS was 80%, with a sensitivity of 100% in detecting all the true variants inside the pools.

Once the procedure was validated, we searched for mutations in a total of 76 patients distributed in 4 pools. In addition to new cases, we also included patients who had been partially sequenced or studied through indirect techniques (such as single-strand conformation analysis). We successfully identified the control variants in these discovery pools, and also found a total of 19 rare nucleotide changes that were classified as probably damaging or variants of uncertain effect. All of them were confirmed and assigned to 17 patients via Sanger sequencing of the corresponding exon. Two patients were double mutation carriers, a condition that has been linked to poor prognosis.²⁵ In a total of 11 of these index cases we performed family studies and identified a total of 26 mutation carriers (data not shown).

The validation pools contained fewer samples than the discovery pools, and it might be argued that specificity and sensitivity could be lower when larger pools undergo NGS. We think this was unlikely in the present study, because the 9 genes were Sanger sequenced in all the 25 patients from the largest pool and we found no false-negative variants, and the false positives were the same as in the validation pools. Although we were able to identify unique nucleotide variants in a pool of 25 individuals (1/50 alleles), the lower limit of detection was not determined. Thus, it is possible that rare unique variants could be read over the sequencing noise level in even larger pools.

An issue to consider when searching for rare variants in DNA pools is the number of Sanger sequences required to characterize the mutation carriers in the pool. Among the 4 discovery pools, the largest number of mutations (n=6) was found in pool D, containing 22 patients. Thus, a total of 132 PCR fragments were amplified and 264 Sanger reads (forward+reverse strands) generated. In comparison, a total of 128 amplicons per patient and 5,632 sequence reads would be required for the Sanger sequencing of the 9 genes in the 22 patients. If the number of putative mutations equals the number of samples (N) in the pool and all the mutations are in different amplicons, a total of N² single amplicons would need to be Sanger sequenced. Identifying all the putative mutations would therefore be impractical in cost and labor time for large DNA pools. In this study we created 8 libraries to sequence a total of 136 HCM patients for approximately €2,400 in library preparation. If we had performed this study using individually barcoded samples, the cost of library preparation would have risen to €40,800, 17-fold greater. In addition, 2 technicians carried out the NGS and Sanger assignment of the putative mutations in the 76 patients from the discovery pools in only 4 weeks, a time much shorter than that required to sequence the 9 genes in all the patients on Sanger sequencing.
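The workload and cost figures in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes 2 Sanger reads (forward and reverse) per PCR fragment and the same library preparation cost for a pooled library and an individually barcoded one; the function names are illustrative.

```python
def sanger_workload(fragments_per_patient: int, n_patients: int) -> tuple[int, int]:
    """Return (PCR fragments, Sanger reads), with forward+reverse reads per fragment."""
    fragments = fragments_per_patient * n_patients
    return fragments, 2 * fragments


# Pool D: 6 putative mutations, each checked in all 22 patients.
print(sanger_workload(6, 22))    # (132, 264)

# Full Sanger sequencing of the 9 genes: 128 amplicons in each of 22 patients.
print(sanger_workload(128, 22))  # (2816, 5632)

# Worst case: N putative mutations in N patients, all in different amplicons.
n = 22
print(sanger_workload(n, n)[0] == n ** 2)  # True

# Library preparation: 8 pooled libraries versus 136 individual libraries.
cost_per_library = 2400 / 8
individual_cost = 136 * cost_per_library
print(individual_cost, individual_cost / 2400)  # 40800.0 17.0
```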

Finally, NGS technology has generated high-throughput sequencing data in genes linked to cardiomyopathies, in patients and in apparently healthy individuals.⁶,⁹ The Exome Sequencing Project found rare variants (ESP database, http://evs.gs.washington.edu) previously identified in HCM patients, a fact that has questioned the pathogenicity of some missense nucleotide changes previously considered as mutations.²⁶,²⁷ At this stage, the present procedure would facilitate rapid and cost-effective sequencing of the main HCM-associated genes in large sets of individuals.

Conclusions

We report the massive parallel sequencing of the genes most commonly mutated in HCM patients. The present procedure would facilitate the rapid and cost-effective screening of these genes on a population scale.

Acknowledgments

This work was supported by a grant from Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional (FIS-12/00287; RD12/0021/0012).

Disclosures

The authors declare no conflict of interest.

Authors Contribution: J.R.R., C.M., M.M., recruited the patients and performed the clinical and echographic studies; J.G., V.A., B.A., S.I., and E.C. performed the genetic studies; J.G. and E.C. wrote the ms. All the authors have seen and approved the final version of the ms.

Supplementary Files

Supplementary File 1

Table S1. Samples used to create the 8 pools

Table S2. Ampliseq amplicons and coverage details

Table S3. Ampliseq for the 9 genes

Table S4. Ion Torrent PGM run characteristics (n=3)

Table S5. GC content in 4 unreadable amplicons

Table S6. Rare SNV in Sanger to NGS validation pool 1 (n=13)

Table S7. Indels in Sanger to NGS validation pool 1

Table S8. Rare SNV in NGS to Sanger validation pool 2

Table S9. Rare SNV in NGS to Sanger validation pool 3

Table S10. Rare SNV in NGS to Sanger validation pool 4

Table S11. Indel variants in NGS to Sanger validation pool 2

Table S12. Indel variants in NGS to Sanger validation pool 3

Table S13. Indel variants in NGS to Sanger validation pool 4

Table S14. Control variants in discovery pools

Table S15. Rare variants in discovery pools A–D

Table S16. Rare SNV in discovery pool A

Table S17. Indel variants in discovery pool A

Figure S1. Examples from the Integrative Genomics Viewer of false-positive single-nucleotide variants and indels, with the corresponding Sanger sequences.

Please find supplementary file(s):

http://dx.doi.org/10.1253/circj.CJ-14-0628

References

1. Núñez L, Gimeno-Blanes JR, Rodríguez-García MI, Monserrat L, Zorio E, et al. Somatic MYH7, MYBPC3, TPM1, TNNT2, and TNNI3 mutations in sporadic hypertrophic cardiomyopathy. Circ J 2013; 77: 2358–2365.
2. Konno T, Chang S, Seidman JG, Seidman CE. Genetics of hypertrophic cardiomyopathy. Curr Opin Cardiol 2010; 25: 205–209.
3. Otsuka H, Arimura T, Abe T, Kawai H, Aizawa Y, Kubo T, et al. Prevalence and distribution of sarcomeric gene mutations in Japanese patients with familial hypertrophic cardiomyopathy. Circ J 2012; 76: 453–461.
4. Niwano S. Multicenter study of the prevalence and distribution of sarcomeric gene mutations in familial hypertrophic cardiomyopathy: A milestone for genetic diagnosis in the Japanese population. Circ J 2012; 76: 303–304.
5. Teekakirikul P, Kelly MA, Rehm HL, Lakdawala NK, Funke BH. Inherited cardiomyopathies: Molecular genetics and clinical genetic testing in the postgenomic era. J Mol Diagn 2013; 15: 158–170.
6. Wheeler M, Pavlovic A, DeGoma E, Salisbury H, Brown C, Ashley EA. A new era in clinical genetic testing for hypertrophic cardiomyopathy. J Cardiovasc Transl Res 2009; 2: 381–391.
7. Costa JL, Sousa S, Justino A, Kay T, Fernandes S, Cirnes L, et al. Nonoptical massive parallel DNA sequencing of BRCA1 and BRCA2 genes in diagnostic setting. Hum Mutat 2013; 34: 629–635.
8. Meder B, Haas J, Keller A, Heid C, Just S, Borries A, et al. Targeted next-generation sequencing for the molecular genetic diagnostics of cardiomyopathies. Circ Cardiovasc Genet 2011; 4: 110–122.
9. Lopes LR, Zekavati A, Syrris P, Hubank M, Giambartolomei C, Dalageorgou C, et al. Genetic complexity in hypertrophic cardiomyopathy revealed by high-throughput sequencing. J Med Genet 2013; 50: 228–239.
10. Harakalova M, Nijman IJ, Medic J, Mokry M, Renkens I, Blankensteijn JD, et al. Genomic DNA pooling strategy for next-generation sequencing-based rare variant discovery in abdominal aortic aneurysm regions of interest: Challenges and limitations. J Cardiovasc Transl Res 2011; 4: 271–280.
11. Otto EA, Ramaswami G, Janssen S, Chaki M, Allen SJ, Zhou W, et al. Mutation analysis of 18 nephronophthisis associated ciliopathy disease genes using a DNA pooling and next generation sequencing strategy. J Med Genet 2011; 48: 105–116.
12. García-Castro M, Reguero JR, Batalla A, Díaz-Molina B, González P, Alvarez V, et al. Hypertrophic cardiomyopathy: Low frequency of mutations in the beta-myosin heavy chain (MYH7) and cardiac troponin T (TNNT2) genes among Spanish patients. Clin Chem 2003; 49: 1279–1285.
13. García-Castro M, Coto E, Reguero JR, Berrazueta JR, Alvarez V, Alonso B, et al. Mutations in sarcomeric genes MYH7, MYBPC3, TNNT2, TNNI3, and TPM1 in patients with hypertrophic cardiomyopathy. Rev Esp Cardiol 2009; 62: 48–56 (in Spanish).
14. Coto E, Reguero JR, Palacín M, Gómez J, Alonso B, Iglesias S, et al. Resequencing the whole MYH7 gene (including the intronic, promoter, and 3’ UTR sequences) in hypertrophic cardiomyopathy. J Mol Diagn 2012; 14: 518–524.
15. Rothberg JM, Hinz W, Rearick TM, Schultz J, Mileski W, Davey M, et al. An integrated semiconductor device enabling non-optical genome sequencing. Nature 2011; 475: 348–352.
16. Loman NJ, Misra RV, Dallman TJ, Constantinidou C, Gharbia SE, Wain J, et al. Performance comparison of benchtop high-throughput sequencing platforms. Nat Biotechnol 2012; 30: 434–439.
17. Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, et al. A tale of three next generation sequencing platforms: Comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics 2012; 13: e341, doi:10.1186/1471-2164-13-341.
18. Li X, Buckton AJ, Wilkinson SL, John S, Walsh R, Novotny T, et al. Towards clinical molecular diagnosis of inherited cardiac conditions: A comparison of bench-top genome DNA sequencers. PLoS One 2013; 8: e67744, doi:10.1371/journal.pone.0067744.
19. Kim HS, Sung JS, Yang SJ, Kwon NJ, Jin L, Kim ST, et al. Predictive efficacy of low burden EGFR mutation detected by next-generation sequencing on response to EGFR tyrosine kinase inhibitors in non-small-cell lung carcinoma. PLoS One 2013; 8: e81975, doi:10.1371/journal.pone.0081975.
20. Millat G, Chanavat V, Rousson R. Evaluation of a new NGS method based on a custom AmpliSeq library and Ion Torrent PGM sequencing for the fast detection of genetic variations in cardiomyopathies. Clin Chim Acta 2014; 433C: 266–271.
21. Nejentsev S, Walker N, Riches D, Egholm M, Todd JA. Rare variants of IFIH1, a gene implicated in antiviral responses, protect against type 1 diabetes. Science 2009; 324: 387–389.
22. Jin SC, Pastor P, Cooper B, Cervantes S, Benitez BA, Razquin C, et al. Pooled-DNA sequencing identifies novel causative variants in PSEN1, GRN and MAPT in a clinical early-onset and familial Alzheimer’s disease Ibero-American cohort. Alzheimers Res Ther 2012; 4: e34, doi:10.1186/alzrt137.
23. Gómez J, Reguero JR, Morís C, Alvarez V, Coto E. Non optical semi-conductor next generation sequencing of the main cardiac QT-interval duration genes in pooled DNA samples. J Cardiovasc Transl Res 2014; 7: 133–137.
24. Missirian V, Comai L, Filkov V. Statistical mutation calling from sequenced overlapping DNA pools in TILLING experiments. BMC Bioinformatics 2011; 12: e287, doi:10.1186/1471-2105-12-287.
25. Maron BJ, Maron MS, Semsarian C. Double or compound sarcomere mutations in hypertrophic cardiomyopathy: A potential link to sudden death in the absence of conventional risk factors. Heart Rhythm 2012; 9: 57–63.
26. Bick AG, Flannick J, Ito K, Cheng S, Vasan RS, Parfenov MG, et al. Burden of rare sarcomere variants in the Framingham and Jackson Heart Study cohorts. Am J Hum Genet 2012; 91: 513–519.
27. Ng D, Johnston JJ, Teer JK, Singh LN, Peller LC, Wynter JS, et al. Interpreting secondary cardiac disease variants in an exome cohort. Circ Cardiovasc Genet 2013; 6: 337–346.
