The Journal of Toxicological Sciences
Online ISSN : 1880-3989
Print ISSN : 0388-1350
ISSN-L : 0388-1350
Review
CRISPR approach in environmental chemical screening focusing on population variability
Nivedita Chatterjee, Xiaowei Zhang

2021 Volume 46 Issue 11 Pages 499-507

Abstract

A significant barrier to including population variability in risk assessment is our incomplete understanding of inter-individual variability and of differential susceptibility to adverse outcomes induced by environmental exposures. By combining genome editing tools with a population diversity model, this article highlights a potential strategy to identify and characterize the factors underlying inter-individual variability, that is, the determinant genes anchored to a particular phenotype. This goal could be achieved by integrating CRISPR-based unbiased functional genomics perturbation screening, either genome-wide or with a focused subset of genes, into a population-based in vitro model system (such as the lymphoblastoid cell lines, LCL, available from the HapMap and 1000 Genomes projects). The data can then be translated to genetic variability and individual (or subpopulation) susceptibility by combining ethnicity information and the corresponding genome-wide association studies (GWAS) with the functional genomics screening results. This approach can provide complementary data for next-generation risk assessment, in particular for environmental stressors. The paper outlines previous work conducted with population-based in vitro model systems, CRISPR-based functional toxicogenomic perturbation screening of environmental chemicals, and, finally, potential strategies to combine these two platforms, with their opportunities and challenges, to achieve a mechanistic understanding of population variability.

INTRODUCTION

The World Health Organization (WHO) estimated that around 24% of the global disease burden and about 23% of deaths are attributable to environmental exposures, specifically toxic stressors (Landrigan et al., 2016; Prüss-Ustün et al., 2017). Adverse health outcomes and diseases induced by environmental exposures vary among individuals across the human population, as people respond differently, in degree and in kind, to the same environmental stressors, including chemical exposures. This differential disease susceptibility is associated with inter-individual variability (e.g., genetic background, epigenetic factors, life stage, aging, etc.) and its interactions with environmental factors (chemical exposures, food, nutritional background, previous and concurrent stressors, etc.). Recent studies have provided evidence that intrinsic genetic variation in genetically diverse populations contributes to the observed differential susceptibility to environmental stressors. For example, an effect of copper has been shown in the 1% of people who carry genes linked to Wilson’s disease (National Academies of Sciences, Engineering, and Medicine, 2016). Hence, characterizing causality, that is, these intrinsic and extrinsic factors and the mechanisms of their interactions that contribute to differential disease susceptibility, is a critical consideration in environmental health protection and risk assessment.

Nonetheless, understanding and detecting these critical aspects remains a significant challenge for formal epidemiological studies. In most cases, traditional epidemiological studies generalize effects to the overall population based on occupational groups or cohorts and are thus limited in their ability to characterize the factors contributing to differential susceptibility. Moreover, detailed analysis of the individual’s exposome in epidemiological studies is still in its initial phase. Hence, there is a need to incorporate biological variability into laboratory models for a more realistic assessment of the risks to the heterogeneous human population, in particular for environmental exposures (National Academies of Sciences, Engineering, and Medicine, 2016; National Research Council, 2009; Zeise et al., 2013; Balik-Meisner et al., 2018; Chiu and Rusyn, 2018; Dornbos and LaPres, 2018). New opportunities have emerged with state-of-the-art tools: in vitro population-based paradigms have already successfully addressed intrinsic variability in genome-wide and exposure-wide association studies for assessing human population variability and identifying susceptible subpopulations (National Academies of Sciences, Engineering, and Medicine, 2016; Zeise et al., 2013; Chiu and Rusyn, 2018). These studies assayed the response to environmental chemical exposures in an in vitro population-based model, a collection of lymphoblastoid cell lines from geographically and ancestrally diverse populations. Incorporating population diversity into environmental toxicological screening studies has demonstrated the potential for more precise risk assessment of environmental exposures, with an emphasis on genetic variability (Dornbos and LaPres, 2018; O'Shea et al., 2011; Lock et al., 2012; Abdo et al., 2015b, 2015a).

Population-based in vitro studies in environmental chemicals screening

Genetically diverse but well-defined (genotyping and GWAS available) human lymphoblastoid cell lines (LCL), such as those from the HapMap (http://hapmap.ncbi.nlm.nih.gov/) and 1000 Genomes (http://www.1000genomes.org/) projects, have been exploited as cell-based epidemiological model systems to identify genetic susceptibility and population variability in the toxicity of various drugs as well as environmental chemicals (Zeise et al., 2013; Dornbos and LaPres, 2018). Such models have been successfully used to probe inter-individual and inter-population variability in response to environmental chemical exposures through standardized dose-response profiling and GWAS anchored to toxicity phenotypes.

LCL-based population-based in vitro model

Two previous studies screened 14 environmental chemicals (O'Shea et al., 2011) and 240 chemicals (Lock et al., 2012) for cytotoxicity and apoptosis endpoints (ATP production and caspase-3/7 activity, respectively) on a panel of individual LCLs from the HapMap Consortium’s CEPH (Centre d’Etude du Polymorphisme Humain) collection. Both studies mainly documented that some, but not all, chemicals were cytotoxic to all cell lines at similar concentrations, while others showed inter-individual variation in the toxic response. Further, the inclusion of GWAS (O'Shea et al., 2011; Lock et al., 2012) and RNA-seq (Lock et al., 2012) highlighted several SNPs on particular chromosomes as potential determinants of inter-individual susceptibility. In brief, these studies support that incorporating genetic diversity into chemical toxicity evaluation is now feasible through population-based LCL panels.

Population variability of chemical-specific cytotoxicity has been estimated at the subpopulation level. Quantitative human toxicodynamic variability has been addressed using 1,086 lymphoblastoid cell lines (source: 1000 Genomes Project) treated with 179 chemicals. A large number of chemicals showed significant, albeit modest, variation in toxicity across populations. Moreover, the calculated in vitro toxicodynamic uncertainty factors (VFd) were closely comparable with previously calculated VFd values for 34 chemicals tested in vivo, which supports the view that incorporating chemical-specific factors for toxicodynamic variability at the population level may provide valuable information for risk assessment. Incorporation of GWAS with cytotoxicity data suggested that SNP rs13120371 in the solute carrier SLC7A11 is potentially associated with inter-individual susceptibility to cytotoxicity induced by several chemicals, such as 2-amino-4-methylphenol, methyl mercuric (II) chloride, and N-methyl-p-aminophenol sulfate (Abdo et al., 2015b). This study highlighted the new opportunity for in vitro toxicity testing at population scale, together with potential mechanisms of inter-individual variation.

A similar cytotoxicity study with two types of complex pesticide mixtures in 146 LCLs from 4 ancestrally and geographically diverse human populations exhibited a similar range of toxicity with considerable inter-individual variability. GWAS and basal RNA-seq, in combination with cytotoxicity data, revealed a SNP (rs1947825 in C17orf54) associated with pesticide mixture toxicity as a possible determinant of variability (Abdo et al., 2015a). Moreover, the authors presented a toxicodynamic uncertainty factor (VFd) of around 3-fold, comparable with inter-individual variability for each pesticide mixture, and demonstrated a methodology for in vitro to in vivo extrapolation. This study facilitates the quantitative evaluation of human health hazards by combining in vitro human population-based cytotoxicity screening, dosimetric adjustment, and comparative population genomics analyses.
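The variability factor discussed above can be illustrated with a small calculation. The sketch below assumes, for illustration only, that VFd is taken as the ratio of the median effective concentration across cell lines to the concentration at a sensitive lower percentile (the 1st percentile by default); the cited studies should be consulted for their exact definition. The function names, cell line labels, and EC10 values are hypothetical.

```python
import statistics


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a list of numbers."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def variability_factor(ec10_by_line, sensitive_percentile=1.0):
    """Ratio of the median EC10 to the EC10 at a sensitive (low) percentile.

    Assumed definition for illustration; a larger value means the most
    sensitive cell lines respond at much lower concentrations than the
    typical cell line.
    """
    values = list(ec10_by_line.values())
    return statistics.median(values) / percentile(values, sensitive_percentile)


# Hypothetical EC10 values (micromolar) for five cell lines.
ec10 = {"line_1": 1.0, "line_2": 2.0, "line_3": 3.0, "line_4": 4.0, "line_5": 5.0}
print(round(variability_factor(ec10), 2))
```

With a realistic panel of hundreds of cell lines, the low percentile is estimated from many observations rather than interpolated between the two most sensitive lines as in this toy example.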

Induced pluripotent stem cell (iPSC)-derived organotypic population-based in vitro model

Recently, this paradigm expanded beyond immortalized LCL to organotypic population-based in vitro models, e.g., induced pluripotent stem cell (iPSC)-derived hepatocytes and cardiomyocytes obtained from different donors, to address inter-individual variability in response to various drugs and chemicals. These studies typically used kinetic calcium flux and high-content imaging as screening phenotypes at various doses, and the results were then matched with GWAS to identify inter-individual variability (Burnett et al., 2019; Grimm et al., 2019). In population-based iPSC-derived cardiomyocytes from 43 individuals of diverse ancestry and both sexes, screened with 134 chemicals (pharmaceuticals, industrial and environmental chemicals, and food constituents), inter-individual variability accounted for more than 80% of the total variance in the tested phenotypes (Ca2+ flux assay and cell viability-related phenotypes analyzed through high-content imaging) (Burnett et al., 2019).

These new population-based experimental and computational tools demonstrated the feasibility of quantifying inter-individual variability and showed the potential for improving risk assessment and decision-making by identifying and characterizing susceptible populations, the level of variability across populations, and the critical biological mechanisms of toxicity and/or susceptibility. In particular, population-based in vitro toxicity screening followed by dose-response assessment represents a surrogate model for quantitative estimation of human toxicokinetic/toxicodynamic variability and thus increases the precision of human health risk assessment by allowing exposure limits to be set with more confidence (Chiu and Rusyn, 2018; Mortensen and Euling, 2013). In addition, the system fulfills the vision of toxicity testing in the 21st century (National Research Council, 2007) by using a large-scale in vitro high-throughput screening system.

Numerous extensions of these studies can be envisioned, particularly for identifying the determinants of inter-individual (subpopulation) genetic variability and susceptibility in the response to toxic chemicals. Population-based studies, including those using in vitro population variability cell lines, have mainly focused on GWAS. However, GWAS-based association studies cannot conclusively assign causality, i.e., they have low efficiency in detecting and pinpointing the causal genes involved in genotype-phenotype interactions (Rager et al., 2019); combining them with other platforms is therefore necessary to elucidate mechanistic relationships more precisely. Genome editing techniques, which can repress or activate targeted genes, can also be applied across population-based in vitro cell lines to uncover the mechanistic, causal relationships between the genetic background of an individual (or subpopulation) and increased susceptibility or resistance to diseases induced by environmental exposure (Rager et al., 2019; Shen et al., 2015).

The environmental health research field could benefit immensely from the application of genome editing tools, particularly CRISPR-Cas9, in several ways, such as hazard identification/ranking and elucidation of toxicity mechanisms/adverse outcome pathways (Rager et al., 2019). The details, with example studies, are discussed in the next section. Furthermore, the causal genetic determinants of inter-individual variability, albeit at the level of in vitro cell populations, have been characterized by coupling single-cell RNA-seq (scRNA-seq) with CRISPR-Cas9 perturbation screening platforms (Jaitin et al., 2016; Rubin et al., 2019; Dixit et al., 2016).

The CRISPR-based genome editing tools and their incorporation in environmental health research

The genome-editing process has been revolutionized by the discovery and application of the clustered regularly interspaced short palindromic repeats (CRISPR) system, which relies on a bacterial endonuclease (Cas9) and a programmed single guide RNA (sgRNA) to cleave specific DNA sequences (Kampmann, 2017) and enables the targeted gene to be knocked out. CRISPR has largely outperformed preceding technologies such as ZFN, TALEN, and RNAi (siRNA/shRNA). It is precise enough that a single gene can be targeted at any point in the genome (Harrison and Hart, 2018; Evers et al., 2016). In addition to CRISPR-based gene knockout (CRISPRo), transcriptional repressor (CRISPRi) and activator (CRISPRa) platforms have also been developed using a catalytically dead Cas9 (dCas9) enzyme. These platforms enable genome-wide loss-of-function and gain-of-function screening as well as inducible and reversible repression screening (Gilbert et al., 2014; Kampmann, 2018).

Advances in the technology and the availability of several genome-wide CRISPR libraries, including knockout libraries (GeCKO and Brunello) (Sanson et al., 2018; Shalem et al., 2014) as well as transcriptional activator and repressor libraries (Gilbert et al., 2014), enable high-throughput screening in any mammalian cell line. In addition to genome-scale perturbation libraries, targeted subset libraries, such as pooled and arrayed libraries for epigenetic factors (Henser-Brownhill et al., 2017) and energy metabolism (Birsoy et al., 2015), have also added value to screening applications in biomedical sciences, drug discovery, host-pathogen interactions, disease etiology, and biomarker identification as well as in therapeutic sciences (Kampmann, 2017, 2018; Baliou et al., 2018; Ramkumar and Kampmann, 2018; Behan et al., 2019; Cao et al., 2018; Han et al., 2018). Accumulating evidence, although still limited, indicates that high-throughput CRISPR-based chemical screening has strong potential in mechanistic toxicology and environmental health studies (Shen et al., 2015; Sobh and Vulpe, 2019).

Genome-wide CRISPRo library (GeCKO) screening identified FTO and MAP2K3 as top-ranked genes in triclosan-induced toxicity in a human hepatoma (HepG2) cell line (Xia et al., 2016). The study used cell viability as the phenotype and a positive selection method with sensitive doses (IC10 or IC20) and a resistant dose (IC50). The immune response pathway was associated with triclosan sensitivity, while the adherens junction, MAPK signaling, and PPAR signaling pathways were involved in resistance mechanisms.

Metabolism-targeted CRISPRo library screening in human T lymphocyte (Jurkat) cells provided insight into paraquat toxicity (Reczek et al., 2017). The positive selection screen revealed that paraquat-induced cell death required three genes, POR (cytochrome P450 oxidoreductase), ATP7A (copper transporter), and SLC45A4 (sucrose transporter), with POR specifically involved in paraquat-induced reactive oxygen species formation. The negative selection screen uncovered two genes, SLC31A1 (plasma-membrane-bound CTR1 copper importer protein) and SOD1 (copper- and zinc-dependent cytoplasmic antioxidant enzyme), and a role for intracellular copper in mitigating paraquat-mediated cytotoxicity.

Genome-wide CRISPRo library (Brunello) screening in organochlorine pesticide-treated human dopaminergic neuronal (SH-SY5Y) cells found common mechanisms of pathogenesis for idiopathic and inherited forms of Parkinson’s disease and, specifically, identified molecular mediators of neurodegeneration that could serve as diagnostic biomarkers as well as potential therapeutic targets (Russo et al., 2019).

Recently, one group published two CRISPRo library (GeCKO version 2) screening studies that exploited the viability phenotype and positive selection with/without chemical treatment in erythroleukemic (K562) cells. One study identified OVCA2 as a determinant of toxicity induced by acetaldehyde, a known human carcinogen (Sobh et al., 2019a). A secondary screening and validation experiment confirmed the role of OVCA2 in DNA adduct (acetaldehyde-derived N2-ethylidene-dG) removal and/or DNA repair. The other study reported that multiple genes involved in selenocysteine metabolism were perturbed in arsenic (AsIII)-induced toxicity (Sobh et al., 2019b). Besides selenocysteine metabolism genes, KEAP1 was also found among the top-ranked genes, and its disruption increased arsenic tolerance.

Flow-cytometry-based cell sorting was recently exploited as the phenotype for genome-scale knockout (GeCKO) screening of arsenic-mediated endoplasmic reticulum (ER) stress-induced apoptosis in CHOP-reporter cell lines. Along with other top-ranked genes (such as L3MBTL2 [L(3)Mbt-Like 2] and MGA [MAX gene associated]), the group identified miR-124-3, which directly targets the IRE1 branch of the ER-stress pathway (Panganiban et al., 2019).

Therefore, we propose incorporating a CRISPR-based genetic perturbation screening approach, either genome-wide or targeting a subset of the genome, into cell-based epidemiological platforms (such as the LCL panel) to identify the determinants of variability and susceptibility in inter-individual (subpopulation) responses through genotype-phenotype relationships (Fig. 1).

Fig. 1

CRISPR-based screens can elucidate determinants of susceptibility by combining with genome-wide association (GWAS) and ethnicity information in a population-based in vitro model.

Proposed strategy based on perturbed CRISPR-based approach and population-based in vitro model system

Here we present a potential study design and strategy that combines a pooled CRISPRo perturbation library with a cell-based epidemiological model system (LCL available from the HapMap and 1000 Genomes projects) for functional genetic profiling and identification of candidate genes responsible for population variability and inter-individual susceptibility. Each cell line is stably transduced with a pooled, lentivirally packaged CRISPRo library targeting the entire genome or a selected set of genes. The antibiotic-selected cells are then divided into control and chemical-exposed batches. A cell viability phenotype with positive selection would be preferred and, after an appropriate incubation period, DNA harvested from the remaining cells is sequenced (Fig. 2A). Key considerations before implementing the functional screening include (Sobh and Vulpe, 2019):

i) Establishing the population-based cell lines after receiving them from the source (LCL from the HapMap and/or 1000 Genomes projects).

ii) Determining the cell doubling time, which is necessary to decide the screening duration.

iii) Efficient lentiviral transduction: a detailed protocol for CRISPR/Cas9 knockout library functional genomics screening, including transduction, in human lymphoma cell lines was published recently (Jiang et al., 2018).

iv) The choice of pooled sgRNA library: the library could be genome-wide (e.g., Brunello, GeCKO), an available focused library, or a customized library (e.g., epigenome, metabolic, or toxicology related). For a customized library, it is vital to consider the effect of human genetic variation on CRISPR-based genome editing efficiency when designing sgRNAs (Lessard et al., 2017; Canver et al., 2018).

v) A pre-screening to determine the multiplicity of infection (MOI) so that most cells express at most one sgRNA; the MOI might be more or less the same for all LCLs even though the panel was obtained from different people.

vi) Chemical selection and exposure dose determination: to this end, the dose-response data reported previously for many chemicals might be followed (O'Shea et al., 2011; Lock et al., 2012; Abdo et al., 2015b). A preliminary cytotoxicity assay in the LCL panel is necessary if the dose-response is not known for the selected chemicals. A low dose of the respective chemical should be chosen for functional genomics screening, as susceptibility genes are targeted.
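Considerations ii) and v) in the list above lend themselves to a quick numerical check. The sketch below assumes exponential growth for the doubling-time estimate and assumes that viral integrations per cell follow a Poisson distribution with mean equal to the MOI, a common simplification that is not taken from the cited protocols. All function names and input values are hypothetical.

```python
import math


def doubling_time_hours(n_start, n_end, elapsed_hours):
    """Doubling time under exponential growth: t * ln(2) / ln(N_end / N_start)."""
    if n_start <= 0 or n_end <= n_start:
        raise ValueError("cell counts must be positive and increasing")
    return elapsed_hours * math.log(2) / math.log(n_end / n_start)


def fraction_transduced(moi):
    """Poisson estimate of the fraction of cells with at least one integration."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    return 1.0 - math.exp(-moi)


def single_sgrna_fraction(moi):
    """Among transduced cells, the Poisson fraction carrying exactly one integration."""
    return moi * math.exp(-moi) / fraction_transduced(moi)


# Hypothetical counts: 1e5 cells grow to 4e5 cells in 48 hours.
print(doubling_time_hours(1e5, 4e5, 48))

# Compare candidate MOI values from a pre-screen.
for moi in (0.1, 0.3, 0.5, 1.0):
    print(moi, round(fraction_transduced(moi), 3), round(single_sgrna_fraction(moi), 3))
```

Under these assumptions, a lower MOI raises the share of transduced cells with a single sgRNA but lowers the share of cells transduced at all, which is the trade-off the pre-screening is meant to settle for each cell line.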

Fig. 2

Strategies and workflow of functional genetic screens with a pooled CRISPR-based knockout library in population-based in vitro cell lines. A. Pooled genetic screens are carried out by stably transducing each cell line from the panel of population-based cell lines so that it expresses the CRISPR machinery. Subsequent phenotype selection is carried out with/without chemical treatment; after NGS, top-ranked genes are determined by the relative enrichment or depletion of the respective sgRNAs in treated versus control cells. The process is then repeated for the whole LCL panel. B. Data analysis to identify the determinants of population variability and susceptibility using the entire gene-rank list of the whole LCL panel for a particular chemical treatment. After cell lines are ranked by susceptibility, the primary validation of top-ranked genes in the respective sensitive cell lines would be conducted.

The primary data analysis would be carried out either with a standard protocol (Sobh and Vulpe, 2019) or by following the ‘big data’ analysis workflow applied to several cancer cell lines in a recent study (Behan et al., 2019). Additional analyses that combine GWAS data with the top-ranked gene list of each cell line are needed to identify in silico connections between susceptibility genes and inter-individual variability for a particular chemical exposure (Fig. 2B). The initial validation phase would typically be performed by knocking out the selected gene in the respective susceptible cell lines and testing whether the knockout produces the phenotype (cell viability) (Fig. 2B). If the initial validation is positive, subsequent experiments could elucidate the entire sensitivity pathway in the susceptible cell lines. The most challenging part is the real-world validation of the identified determinants of susceptibility in epidemiological studies. The available options are: i) using data from previously published reports to support the hypothesis and initial validation, and ii) collaborating with an epidemiological study group to design future validation studies.
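The enrichment or depletion ranking described for the primary analysis can be sketched in a few lines. This is a simplified illustration, not the cited protocols: it assumes read-depth normalization to counts per million, a log2 fold change per sgRNA with a pseudocount, and a median across sgRNAs as the gene-level score. The guide names, gene names, and counts are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def normalize(counts, scale=1_000_000):
    """Scale raw sgRNA read counts to counts per `scale` reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads in sample")
    return {guide: c * scale / total for guide, c in counts.items()}


def gene_scores(control_counts, treated_counts, guide_to_gene, pseudocount=1.0):
    """Median log2 fold change (treated vs control) of the sgRNAs for each gene."""
    ctrl = normalize(control_counts)
    trt = normalize(treated_counts)
    per_gene = defaultdict(list)
    for guide, gene in guide_to_gene.items():
        ratio = (trt.get(guide, 0.0) + pseudocount) / (ctrl.get(guide, 0.0) + pseudocount)
        per_gene[gene].append(math.log2(ratio))
    return {gene: statistics.median(lfcs) for gene, lfcs in per_gene.items()}


def rank_genes(scores):
    """Genes ordered from most enriched to most depleted in treated cells."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical library of four sgRNAs targeting two genes.
guide_to_gene = {"g1": "GENE_A", "g2": "GENE_A", "g3": "GENE_B", "g4": "GENE_B"}
control = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
treated = {"g1": 400, "g2": 400, "g3": 100, "g4": 100}

scores = gene_scores(control, treated, guide_to_gene)
print(rank_genes(scores))
```

In a positive selection screen, genes at the top of this ranking are those whose loss lets cells survive treatment; running the same function on every cell line in the panel yields the per-line gene-rank lists that feed the population-level comparison.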

Challenges and opportunities

This unbiased population-based functional genomics approach can explore and characterize functional SNPs and mechanistic biomarkers and, further, can translate the underlying mechanisms of cellular response into genetic variability and individual (or subpopulation) susceptibility (or resilience) to environmental factors. Thus, these model systems have the potential to fill the data gap on population variability in next-generation risk assessment.

Chemicals that demonstrate high variability in gene targets and gene-phenotype responses could be given higher priority for further study, including validation of the susceptible pathway(s) and epidemiological studies to obtain relevant human data.
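One way to make this prioritization concrete is to summarize, for each chemical, how much the gene-level screening scores differ between cell lines. The sketch below is an assumption-laden illustration: it uses the population standard deviation of each gene's score across cell lines and averages it over genes. The metric, the function names, and the data are hypothetical and not drawn from the cited studies.

```python
import statistics


def gene_variability(scores_by_line):
    """Standard deviation of each gene's score across cell lines.

    scores_by_line maps cell line -> {gene: score}; only genes scored in
    every cell line are compared.
    """
    if not scores_by_line:
        raise ValueError("at least one cell line is required")
    shared = set.intersection(*(set(s) for s in scores_by_line.values()))
    return {
        gene: statistics.pstdev([s[gene] for s in scores_by_line.values()])
        for gene in shared
    }


def chemical_variability(scores_by_line):
    """Mean across genes of the between-line standard deviation."""
    per_gene = list(gene_variability(scores_by_line).values())
    return statistics.mean(per_gene) if per_gene else 0.0


def prioritize(screens):
    """Chemicals ordered from most to least variable across the panel."""
    return sorted(screens, key=lambda chem: chemical_variability(screens[chem]), reverse=True)


# Hypothetical gene scores for two chemicals in two cell lines.
screens = {
    "chem_X": {"line_1": {"G1": 2.0, "G2": 0.0}, "line_2": {"G1": -2.0, "G2": 0.0}},
    "chem_Y": {"line_1": {"G1": 1.0, "G2": 0.0}, "line_2": {"G1": 1.0, "G2": 0.0}},
}
print(prioritize(screens))
```

A simple average is used here for readability; other summaries, such as the number of genes exceeding a variability threshold, could serve the same purpose.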

This strategy holds promise for shaping future mechanism-based epidemiological studies and for identifying environmental disease associations, gene-environment interactions, and specific biomarkers. However, the in vitro system, in particular the LCL model, has intrinsic limitations, such as alterations due to immortalization, a lack of organ-specific physiology, no consideration of other important variables (such as age, lifestyle factors, diet, etc.), constrained metabolic capacity, and uncertainty in translating results from cell lines to the whole human, e.g., in predicting human doses from cell treatment doses (Dornbos and LaPres, 2018; Abdo et al., 2015a).

The current in vitro population variability paradigm does not address non-genetic sources of variability, such as epigenetic factors, which play a pivotal role in an individual’s susceptibility. A CRISPR-based epigenetics-focused perturbation library (Henser-Brownhill et al., 2017) could address this issue and identify epigenetic determinants, alongside genetic ones, of inter-individual variation in gene-environment relationships. Nevertheless, epigenetic regulation and gene expression depend on tissue type; hence, epigenetic findings from this platform would be confined to the particular cell lines studied until further epidemiological data support or validate them.

As this is a population-based platform, it is vital to consider the effects of genetic variation on CRISPR/Cas9-based genome editing efficiency. Single nucleotide polymorphisms (SNPs), single nucleotide variations (SNVs), and insertions/deletions (indels) can impact on- and off-target specificity; nonetheless, these concerns relate mainly to therapeutic applications. These hurdles could be overcome by using GWAS information when designing sgRNAs and by choosing an appropriate Cas9 orthologue (Lessard et al., 2017; Canver et al., 2018; Scott and Zhang, 2017). When using a pre-designed library, we may focus on the ‘validation part’ to confirm that the identified candidate gene is indeed linked to the phenotype.

The inclusion of genetic variability will increase noise (Dornbos and LaPres, 2018), so a better data analysis platform is needed.

The experimental bottleneck is the labor-intensive handling of many cell lines with pooled genome-wide CRISPR perturbation (establishment, maintenance, transduction, treatment, NGS, and data analysis). A recent study in which the authors performed genome-scale pooled CRISPRo library (GeCKO) screens in 324 human cancer cell lines from 30 cancer types could serve as a template (Behan et al., 2019). Moreover, applying a focused (pathway-based or disease-specific) CRISPR-based perturbation library instead of a genome-wide one would scale down the workload (Sobh and Vulpe, 2019) and serve as a better ‘mid-throughput’ screening platform.

Currently, it is feasible to establish the platform with LCL from the HapMap (O'Shea et al., 2011) and/or 1000 Genomes projects (Abdo et al., 2015b); it could later be expanded to other cell types from genetically and geographically diverse populations or even to population-based iPSC-derived cell lines obtained from various donors (Burnett et al., 2019; Grimm et al., 2018, 2019).

In the future, it will be possible to screen with phenotypes other than cell viability, for example by applying array-based CRISPR perturbation libraries and high-throughput strategies for analyzing complex cellular phenotypes (Kampmann, 2017; Henser-Brownhill et al., 2017). In addition, combining single-cell RNA-seq with the current experimental design would reveal the effect of gene perturbation on the transcriptome (Dixit et al., 2016; Jaitin et al., 2016).

To avoid disrupting essential genes, CRISPRi or CRISPRa libraries could be a better choice in future studies, depending on the questions and phenotypes of the cell-based epidemiology (Kampmann, 2018).

In summary, we foresee that, despite its limitations, this unbiased screening platform has great potential to advance epidemiological studies, specifically by identifying the determinants of inter-individual or subpopulation susceptibility to diseases induced by environmental exposure. Moreover, beyond the risk assessment of specific chemicals, this type of study could provide a rich resource of perturbed gene function data for developing public health solutions with informed choices.

ACKNOWLEDGMENT

This work was supported by the ‘100 talent program’ from Govt. of Jiangsu province, China (grant number: SBX2019010069).

Conflict of interest

The authors declare that there is no conflict of interest.

Terms and Definitions

Inter-individual variability: Differences between individuals in their response to environmental stressors and in diseases induced by environmental exposure. For example, some people become sick in the same toxic environment while others remain healthy, a difference governed by genetic and epigenetic factors.

Population-based in vitro model: Collection of cell lines (mainly lymphoblastoid cell lines) from ancestrally and geographically diverse human populations.

GWAS: Genome-wide association study (GWAS) is the approach that studies genetic variation at the population level, particularly the association with diseases.

CRISPR-Cas9: CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) together with a bacterial endonuclease (Cas9) represents a genome editing technology that precisely cleaves specific DNA sequences, guided by a single-guide RNA (sgRNA) carrying a 20-nucleotide targeting sequence.

CRISPRo: CRISPR-based genome-wide gene knock-out (i.e., gene deletion) library.

CRISPRi: CRISPR-based genome-wide gene interference (i.e., suppressions of gene expressions) library.

CRISPRa: CRISPR-based genome-wide gene activation (i.e., activation of gene expressions) library.

REFERENCES
 
© 2021 The Japanese Society of Toxicology