The Keio Journal of Medicine
Online ISSN : 1880-1293
Print ISSN : 0022-9717
ISSN-L : 0022-9717

This article has now been updated. Please use the final version.

A Survey of Genome Editing Activity for 16 Cas12a Orthologs
Bernd Zetsche, Jonathan Strecker, Omar O. Abudayyeh, Jonathan S. Gootenberg, David A. Scott, Feng Zhang
Advance online publication

Article ID: 2019-0009-OA

Abstract

The class 2 CRISPR-Cas endonuclease Cas12a (previously known as Cpf1) offers several advantages over Cas9, including the ability to process its own array and the requirement for just a single RNA guide. These attributes make Cas12a promising for many genome engineering applications. To further expand the suite of Cas12a tools available, we tested 16 Cas12a orthologs for activity in eukaryotic cells. Four of these new enzymes demonstrated targeted activity, one of which, from Moraxella bovoculi AAX11_00205 (Mb3Cas12a), exhibited robust indel formation. We also showed that Mb3Cas12a displays some tolerance for a shortened PAM (TTN versus the canonical Cas12a PAM TTTV). The addition of these enzymes to the genome editing toolbox will further expand the utility of this powerful technology.

Introduction

The ability to edit the genome of living cells enables a broad range of downstream genetic analyses and has the potential for therapeutic use to resolve pathogenic mutations. Over the past several years, enzymes from CRISPR-Cas systems, which provide bacteria and archaea with adaptive immunity, have emerged as powerful tools for eukaryotic gene editing. In nature, CRISPR-Cas systems acquire DNA snippets that match invading viruses or foreign nucleic acids, creating a memory bank of infection. These snippets are then transcribed into short RNA guides, which are used by Cas proteins to detect invading nucleic acids. Once a sequence match is found, Cas nucleases destroy the foreign nucleic acid. In particular, Class 2 CRISPR-Cas systems are well-suited for development as molecular technologies because they contain single effector enzymes. These effector enzymes, such as Cas9, are RNA-guided DNA endonucleases that have been harnessed for a range of genome engineering applications.1

Although Cas9 was the first such enzyme to be developed as a genome editing tool,2,3 three orthologs of Cas12a (a single RNA-guided class 2 effector previously known as Cpf1) from Francisella novicida U112 (FnCas12a), Acidaminococcus sp. BV3L6 (AsCas12a), and Lachnospiraceae bacterium ND2006 (LbCas12a) have also been used for genome editing in eukaryotic cells.4,5,6,7,8 Endonucleases of the Cas12a family differ from those of the Cas9 family in several ways: (i) Cas12a utilizes T-rich protospacer adjacent motifs (PAMs) located 5′ of the targeted DNA sequence, (ii) target cleavage occurs distally from the PAM and results in sticky-end overhangs, (iii) Cas12a is guided by a single CRISPR RNA (crRNA) and does not require trans-activating CRISPR RNA, and (iv) Cas12a possesses both RNase and DNase activity, which allows it to process its own CRISPR array.7,9 These features make Cas12a particularly useful in certain situations, such as targeting AT-rich genomic regions and multiplexed gene targeting.8,10 Additionally, Cas12a has been shown to possess non-specific single-stranded DNA cleavage activity after it has been activated by target binding, which has been leveraged for nucleic acid detection.11,12,13,14 Finally, Cas12a is more specific than Cas9 in certain contexts, making it well-suited to applications in which high specificity is critical.15,16

Given previous work showing that different Cas enzyme orthologs exhibit a range of activity in eukaryotic cells2,16,17 and indicating the potential advantages of Cas12a, we sought to identify additional Cas12a orthologs with high activity in eukaryotic cells. Here we examine 16 new Cas12a-family proteins for nuclease activity in human cells. We identify four orthologs that can induce insertion/deletion (indel) events at targeted genomic loci. One ortholog, from Moraxella bovoculi AAX11_00205 (Mb3Cas12a), exhibited comparable activity to AsCas12a and LbCas12a when targeting sites containing TTTV (V=A, C, or G) PAMs. We also show that Mb3Cas12a can recognize a TTN PAM, but with lower efficiency than the conserved TTTV PAM. Together, these new orthologs expand the genome editing toolbox, providing new enzymes that can be used for tailored applications.

Materials and Methods

Computational search for Cas12a orthologs. Cas12a orthologs were selected as previously described.7

Cell culture and transfection. HEK293T cells were maintained at 37°C with 5% CO2 in Dulbecco’s Modified Eagle Medium (Gibco) supplemented with 10% fetal bovine serum (HyClone) and 2 mM GlutaMAX (Life Technologies). For indel analysis, 22,000 cells were seeded per well of a 96-well plate (Corning) 1 day before transfection. Each well was transfected using Lipofectamine 2000 (Thermo Fisher Scientific) with either 100 ng Cas12a-encoding plasmid (see the supplemental file) plus 50 ng guide-encoding plasmid or PCR fragment, or 150 ng of a single plasmid encoding both Cas12a and the guide. Cells were harvested 3 days after transfection using QuickExtract DNA extraction solution according to the manufacturer’s protocol and analyzed by surveyor assay or deep sequencing. To generate Cas12a-containing whole-cell lysate, 120,000 cells were seeded per well of a 24-well plate (Corning) 1 day before transfection. Each well was transfected with 500 ng Cas12a-encoding plasmid, and cell lysate was harvested 2 days after transfection.

In vitro PAM identification assay. The in vitro PAM identification assay was performed as described previously.18 Briefly, whole-cell lysate from HEK293T cells overexpressing one of the Cas12a orthologs was prepared with lysis buffer (20 mM HEPES, 100 mM KCl, 5 mM MgCl2, 1 mM DTT, 5% glycerol, 0.1% Triton X-100) supplemented with EDTA-free Complete Protease Inhibitor Cocktail (Roche). crRNAs with the corresponding direct repeat sequences were transcribed in vitro using custom oligonucleotides and a HiScribe T7 in vitro Transcription Kit (NEB) according to the manufacturer’s recommended protocol for small RNA transcripts. The PAM library consisted of a pUC19 plasmid carrying a degenerate 8-bp sequence 5′ of a 33-bp target site.7 The library was pre-cleaved with XmnI and column purified prior to use (Qiagen). Each in vitro cleavage reaction consisted of 1 μl 10× CutSmart buffer (NEB), 200 ng PAM library, 500 ng in vitro transcribed crRNA, 10 μl cell lysate, and water for a total volume of 20 μl. Reactions were incubated at 37°C for 1 h and stopped by adding 500 μl buffer PB (Qiagen) followed by column purification. Purified DNA was amplified and sequenced using a MiSeq (Illumina) with a single-end 150-cycle kit. Sequencing results were entered into the PAM discovery pipeline.7
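The idea behind a depletion-based PAM screen can be sketched in a few lines. Because cleaved plasmids drop out of the amplified pool, PAMs that support cleavage become under-represented relative to an uncleaved control library. The sketch below is an illustration only and is not the PAM discovery pipeline cited above: the function name `pam_depletion`, the pseudocount, the log2 scoring, and the 4-nt toy PAMs are all assumptions made for the example (the actual library carries a degenerate 8-bp sequence).

```python
from collections import Counter
from math import log2


def pam_depletion(control_pams, treated_pams, pseudocount=1.0):
    """Score each PAM by log2(control frequency / treated frequency).

    Positive scores mean the PAM is depleted after cleavage.
    The pseudocount (an assumption) avoids division by zero.
    """
    control = Counter(control_pams)
    treated = Counter(treated_pams)
    all_pams = set(control) | set(treated)

    # Normalize to frequencies so different read depths are comparable.
    n_control = sum(control.values()) + pseudocount * len(all_pams)
    n_treated = sum(treated.values()) + pseudocount * len(all_pams)

    scores = {}
    for pam in all_pams:
        f_control = (control[pam] + pseudocount) / n_control
        f_treated = (treated[pam] + pseudocount) / n_treated
        scores[pam] = log2(f_control / f_treated)
    return scores


# Toy read lists (hypothetical data, not from the study).
control_reads = ["TTTA"] * 50 + ["GGGA"] * 50
treated_reads = ["TTTA"] * 5 + ["GGGA"] * 95

scores = pam_depletion(control_reads, treated_reads)
for pam, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pam, round(score, 2))
```

In this toy input the T-rich PAM receives a positive score (depleted) and the G-rich PAM a negative one, which is the qualitative pattern such a screen looks for.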

Surveyor assay. The surveyor assay was performed as previously described.19 Briefly, genomic regions flanking a target site for each gene were amplified by PCR, and the products were purified using a QiaQuick Spin Column (Qiagen). Total purified PCR products (400 ng) were mixed with 2 µL 10× Taq DNA Polymerase buffer (Enzymatics) and ultrapure water to a final volume of 20 µL. Re-annealing was achieved by heating to 95°C for 2 min followed by a slow cool down to 10°C (∼2.5°C per min). Re-annealed products were treated with surveyor nuclease (IDT) according to the manufacturer’s protocol. Cleavage products were then visualized on 10% Novex TBE polyacrylamide gels (Life Technologies). Gels were stained with SYBR Gold DNA stain (Life Technologies) for 10 min and imaged with a Gel Doc imaging system (Bio-Rad).

Deep sequencing. Targeted regions were amplified using a previously described two-step PCR protocol.19 Indels were counted computationally as previously described.18 Briefly, each amplicon was searched for exact matches within a 70-bp window around the cut site. For each sample, the indel rate was determined as (number of reads with indel) / (number of total reads). Samples with fewer than 1000 total reads were not included in subsequent analyses.
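The indel-rate definition and read-depth filter described above can be written as a short function. This is a minimal sketch under stated assumptions: the function name `indel_rate`, the choice to return `None` for excluded samples, and the sample counts are hypothetical, and the step that decides whether a read carries an indel is not reproduced here.

```python
from typing import Optional

MIN_TOTAL_READS = 1000  # exclusion threshold stated in the text


def indel_rate(reads_with_indel: int, total_reads: int) -> Optional[float]:
    """Return (reads with indel) / (total reads), or None if excluded."""
    if reads_with_indel < 0 or total_reads < 0 or reads_with_indel > total_reads:
        raise ValueError("read counts are inconsistent")
    # Samples with fewer than 1000 total reads are left out of the analysis.
    if total_reads < MIN_TOTAL_READS:
        return None
    return reads_with_indel / total_reads


# Hypothetical read counts, for illustration only.
samples = {"sample_A": (450, 2500), "sample_B": (30, 800)}
rates = {name: indel_rate(*counts) for name, counts in samples.items()}
print(rates)
```

Here `sample_A` yields a rate of 0.18 (18% indels), while `sample_B` is excluded because it has only 800 reads.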

Results

We selected 16 uncharacterized Cas12a-family proteins with varying degrees of homology to three Cas12a orthologs (FnCas12a, AsCas12a, and LbCas12a)4,7 with confirmed activity in eukaryotic cells (Fig. 1A). The direct repeat (DR) sequences of crRNAs associated with Cas12a orthologs show high levels of homology (Fig. 1B) and are predicted to fold into almost identical secondary structures (Fig. 1C). The homology is particularly strong at the stem structure and the AAUU motif (Fig. 1C), which is required for efficient crRNA maturation,9 suggesting that the mechanism of crRNA maturation may be conserved within the Cas12a-family.

Fig. 1

Analysis of Cas12a ortholog diversity.

(A) Phylogenetic tree of 16 new Cas12a orthologs and 3 Cas12a orthologs with confirmed activity in eukaryotic cells (1-FnCas12a, 7-AsCas12a, and 13-LbCas12a). The approximate locations of the RuvC subdomains and the nuclease (Nuc) domain are shaded in blue and pink, respectively. (B) Alignment of direct repeat sequences of Cas12a orthologs. Sequences that are removed post crRNA maturation are shown in gray letters. Non-conserved bases are colored red. The stem duplex is shaded gray. (C) RNA secondary structures for mature crRNAs from 1-FnCas12a, 7-AsCas12a, 13-LbCas12a, and 23-Mb3Cas12a, predicted using Geneious 2 software.

We performed a previously described in vitro assay18 to determine the sequence of the PAM for each Cas12a ortholog (Fig. 2A). Of the 16 new Cas12a proteins, ten were active in vitro and recognized a T-rich PAM located 5′ of the targeted sequence (Fig. 2B), just as previously characterized Cas12a proteins do.7

Next, we tested the 16 Cas12a orthologs for activity in human cells. We chose a previously validated target within VEGFA, located next to a TTTG PAM that is permissive to all Cas12a orthologs. HEK293T cells were transfected with plasmids encoding humanized Cas12a orthologs together with PCR amplified fragments comprising a U6 promoter fused to the corresponding crRNA sequence (Fig. 3A). Four of the new Cas12a orthologs [Thiomicrospira sp. Xs5 (TsCas12a), Moraxella bovoculi AAX08_00205 (Mb2Cas12a), Moraxella bovoculi AAX11_00205 (Mb3Cas12a), and Butyrivibrio sp. NC3005 (BsCas12a)] were able to induce detectable indel events, as measured by surveyor nuclease assay (Fig. 3B). We tested these orthologs with six additional guides targeting either DNMT1 or EMX1 next to TTTV PAMs (Fig. 3C) and compared them to the activity of AsCas12a and LbCas12a. For all four new Cas12a enzymes, indel frequencies of >20% could be detected for at least two guides, but only Mb3Cas12a was able to induce robust indel levels with all six guides comparable to those of AsCas12a and LbCas12a. The apparent difference in activities between Mb2Cas12a and Mb3Cas12a was somewhat surprising given that these orthologs share a predicted homology of 94.7%.

Fig. 2

PAM identification for Cas12a orthologs.

(A) Schematic for in vitro PAM screen. A library of plasmids bearing randomized 5′ PAM sequences was cleaved by individual Cas12a nucleases and their corresponding crRNAs. Uncleaved plasmid DNA was PCR amplified and sequenced to identify depleted PAM sequences. (B) PAM sequences for ten Cas12a orthologs identified by in vitro PAM screens.

Fig. 3

Activity of Cas12a orthologs in human cells.

(A) Sixteen human codon-optimized Cas12a orthologs were expressed in HEK293T cells using CMV-driven expression vectors. The corresponding crRNA was expressed from PCR amplified fragments containing a U6 promoter fused to the crRNA sequence. NLS, nuclear localization signal. (B) Comparison of in vitro activity using a pre-validated guide targeting VEGFA next to a TTTV (V=A, C, or G) PAM. Indel frequencies were detected by surveyor assay. Red triangles indicate cleaved fragments. The percent indel frequencies are the averages of three bioreplicates. (C) The activities of four new Cas12a orthologs compared to AsCas12a and LbCas12a using six guides targeting either EMX1 or DNMT1. Each data point represents one guide; indel frequencies were determined by surveyor assay and are shown as the means of all guides with SEMs.

Because Mb3Cas12a was predicted to recognize a less restrictive PAM than the TTTV consensus PAM of AsCas12a and LbCas12a (Fig. 2B), we tested the ability of Mb3Cas12a to cleave endogenous DNA at TTN PAMs. To this end, we designed 64 guides: 16 guides each for DNMT1, EMX1, GRIN2b, and VEGFA, targeting next to every combination of NTTN PAMs. To compare the activity of Mb3Cas12a, AsCas12a, and LbCas12a at NTTN PAMs, we transfected HEK293T cells with two plasmids, one expressing Cas12a and one expressing the crRNA, and assessed indel frequencies at each target site by deep sequencing. The average activity at TTTV PAMs was approximately 18% for Mb3Cas12a, 28% for AsCas12a, and 13% for LbCas12a (Fig. 4A). A few guides targeting next to NTTN PAMs (three for Mb3Cas12a and one for AsCas12a) resulted in activity between 25–45% indels. However, whereas Mb3Cas12a performed better than AsCas12a and LbCas12a at NTTN PAMs, the average activity was relatively low, at approximately 5.3% for Mb3Cas12a, approximately 2.7% for AsCas12a, and approximately 1.4% for LbCas12a. Comparing activity across all VTTN PAMs revealed statistically significant differences in indel activity between Mb3Cas12a and LbCas12a (mean 5.26% vs 1.38%, P = 0.0117), but not between Mb3Cas12a and AsCas12a (mean 5.26% vs 2.69%, P = 0.1343).
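The guide count follows directly from the PAM design: NTTN has two free positions, giving 16 PAMs, and one guide per PAM in each of four genes gives 64 guides. The short sketch below enumerates this layout; it is illustrative only, and the variable names are assumptions (actual guide sequences are not reproduced here).

```python
from itertools import product

BASES = "ACGT"
GENES = ["DNMT1", "EMX1", "GRIN2b", "VEGFA"]

# NTTN: any base, then TT, then any base -> 4 x 4 = 16 PAMs.
nttn_pams = [f"{first}TT{last}" for first, last in product(BASES, repeat=2)]

# One guide slot per (gene, PAM) pair.
guide_slots = [(gene, pam) for gene in GENES for pam in nttn_pams]

# Subsets referred to in the text: canonical TTTV (V = A, C, or G)
# and VTTN (first position is not T).
tttv = [pam for pam in nttn_pams if pam.startswith("TTT") and pam[3] in "ACG"]
vttn = [pam for pam in nttn_pams if pam[0] in "ACG"]

print(len(nttn_pams), len(guide_slots))  # 16 64
print(tttv)                              # ['TTTA', 'TTTC', 'TTTG']
print(len(vttn))                         # 12
```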

Fig. 4

Evaluation of activity with relaxed PAM sequences.

(A) Mb3Cas12a, AsCas12a, and LbCas12a were tested for recognition of NTTN PAMs using four guides per PAM, targeting four different genes (DNMT1, EMX1, GRIN2b, or VEGFA). Indel frequencies were determined by deep sequencing. Each data point represents the average of three bioreplicates for one guide. Data are shown as means with SEMs. (B) Mb3Cas12a was tested with 18 guides targeting either DNMT1 or EMX1 next to a RTTN and NYYN PAM (R=A or G, Y=C or T). Each data point represents one guide; data are shown as means with SEMs.

Based on the in vitro PAM screen, Mb3Cas12a tolerates Cs or Ts within its PAM. To assess the tolerance for Cs at positions 2 and 3 of the Mb3Cas12a PAM, we used 18 guides targeting DNMT1 or EMX1 next to RTTN and NYYN PAMs (R=A or G, Y=C or T). HEK293T cells were transfected with a single plasmid expressing Mb3Cas12a and crRNA. The activity of each guide was determined using the surveyor nuclease assay. Guides targeting next to RTTN, RCTN, and RTCN PAMs had an average activity of approximately 15%, approximately 9%, and approximately 4%, respectively; however, guides targeting next to RCCN PAMs were mostly inactive (Fig. 4B). Taken together, our data show that Mb3Cas12a is active in human cells and shows robust activity at TTTV PAMs at levels comparable to those of AsCas12a and LbCas12a. Furthermore, Mb3Cas12a can reliably target sites with RTTV PAMs, albeit with lower overall activity.
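The degenerate PAM codes used here (R, Y, V, N) are standard IUPAC nucleotide codes, and checking whether a 4-nt sequence falls into one of the tested PAM classes is a simple lookup. The following sketch shows one way to do that; the function names `matches_pam` and `classify_pam` and the example sequences are hypothetical and are not part of the study's software.

```python
from typing import Optional, Sequence

# IUPAC nucleotide codes used in the text (R = A or G, Y = C or T,
# V = A, C, or G, N = any base).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "V": "ACG", "N": "ACGT",
}

PAM_CLASSES = ("TTTV", "RTTN", "RCTN", "RTCN", "RCCN")


def matches_pam(sequence: str, pattern: str) -> bool:
    """True if every base of `sequence` is allowed by the IUPAC `pattern`."""
    if len(sequence) != len(pattern):
        return False
    return all(
        base in IUPAC[code]
        for base, code in zip(sequence.upper(), pattern.upper())
    )


def classify_pam(sequence: str, patterns: Sequence[str] = PAM_CLASSES) -> Optional[str]:
    """Return the first pattern that `sequence` matches, or None."""
    for pattern in patterns:
        if matches_pam(sequence, pattern):
            return pattern
    return None


for pam in ["TTTA", "ATTG", "GCTA", "ATCC", "ACCT", "CTTA"]:
    print(pam, classify_pam(pam))
```

Note that `CTTA` matches none of the listed classes because C is neither T nor a purine (R), so the function returns `None` for it.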

Discussion

Reaching the full potential of CRISPR-based genome editing will require a suite of tools to ensure that there are optimal enzymes for a range of genomic contexts. This will be particularly important for the therapeutic deployment of CRISPR, where the target site will be constrained by the genetic variations found in individual patients. Moreover, to tackle the full landscape of pathogenic mutations, a number of different gene editing strategies will be needed beyond simple gene knockout. Consequently, having an array of Cas enzymes that can be used in human cells is essential for the continued development of this technology.

Here we examined 16 new Cas12a family proteins for potential use in genome editing. Four of these, Thiomicrospira sp. Xs5 (TsCas12a), Moraxella bovoculi AAX08_00205 (Mb2Cas12a), Moraxella bovoculi AAX11_00205 (Mb3Cas12a), and Butyrivibrio sp. NC3005 (BsCas12a), exhibited activity in human cells. We chose HEK293 cells as a model for exploring these new Cas12a orthologs because there is a wealth of published data available on the efficiencies of other Cas enzymes in these cells. Previously, we observed only weak activity of FnCas12a in mammalian cells.7 However, a recent study found that FnCas12a exhibits robust activity in plant cells,4 indicating that Cas12a orthologs might have different activities depending on the organism. Therefore, it may be informative in future studies to test the activities of these new Cas12a orthologs in different cell types and organisms.

Further analysis of the PAM requirements of the most active new ortholog, Mb3Cas12a, showed that it has a less restricted PAM (TTV) than AsCas12a and LbCas12a, which are active only at the canonical TTTV PAM. Alignment of Mb3Cas12a to other Cas12a orthologs did not suggest any immediate reason for the more relaxed PAM (data not shown), and further work will be required to investigate the structural basis for this altered PAM requirement.

Given the advantageous properties of Cas12a, such as its inherent high specificity and distinct PAM preference, this family of enzymes represents a powerful addition to the gene editing toolbox. Here, we further expanded the utility of Cas12a by identifying new orthologs that are active in human cells.

Author Contributions

B.Z., J.S., and F.Z. conceived this study. B.Z. and J.S. performed the experiments with help from all authors. O.A. and J.G. analyzed PAM detection data. D.S. contributed to computational analysis of Cas12 orthologs. F.Z. supervised the research. B.Z. and F.Z. wrote the manuscript with input from all authors.

Acknowledgments

We thank R. Macrae, R. Belliveau, G. Faure, and L. Gao for discussions and support. J.S. is supported by the Human Frontier Science Program. F.Z. is a New York Stem Cell Foundation–Robertson Investigator. F.Z. is supported by National Institutes of Health grants (1R01-HG009761, 1R01-MH110049, and 1DP1-HL141201); the Howard Hughes Medical Institute; the New York Stem Cell, Edward Mallinckrodt, Jr., and G. Harold and Leila Mathers Foundations; the Poitras Center for Psychiatric Disorders Research at MIT; the Hock E. Tan and K. Lisa Yang Center for Autism Research at MIT; J. and P. Poitras; and the Phillips Family. F.Z. is a co-founder and advisor of Beam Therapeutics, Editas Medicine, Arbor Biotechnologies, Sherlock Biosciences, and Pairwise Plants. The authors plan to make the reagents widely available to the academic community through Addgene and to provide software tools via the Zhang lab website (zlab.bio). A patent has been filed relating to the presented data.

Conflicts of Interest

The authors have declared that no conflict of interest exists.

References
 
© 2019 by The Keio Journal of Medicine