To whom correspondence should be addressed: Fumio Imamoto, Department of Molecular Biology, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka, 565-0871, Japan. Tel: +81–6–6879–8325, Fax: +81–6–6879–8335. Abbreviations: EF-1α, elongation factor-1α; Flp, flippase; FRT, Flp recognition target; cHS4, chicken β-globin DNaseI hypersensitive site 4; H2B, histone H2B; EB1, end-binding 1; SECFP, super enhanced cyan fluorescent protein; Venus, super enhanced yellow fluorescent protein; mKeima, monomeric Keima; mCherry, monomeric Cherry.

Index
Introduction
Materials and Methods
1. Plasmid construction
2. Cell culture and transfection
3. Microscopy
4. Plasmid rescue and sequence analysis
5. Bioinformatics analysis of the motifs
Results
Genomic integration sites in HeLaS3
Genomic features of φC31 integrase-mediated integration sites
Successive integration of two separate multigene expression cassettes into different sites in a single cell
Discussion
Characterization of HeLaS3 pseudo attP sites
Successive integration of multiple transgenes
Acknowledgements
References

Introduction

Demand for genomic integration of multiple genes into living cells has increased in recent years, driven by the need for powerful cellular tools in regenerative medicine and gene therapy. Over the past 17 years, several reports have described the development of polycistronic viral vectors that transduce two, three or four genes simultaneously into eukaryotic cells (De Felipe and Izquierdo, 2000; de Felipe, 2002; Ngoi et al., 2004; Szymczak and Vignali, 2005; Gonzalez-Nicolini et al., 2006). Some commonly used viral vectors mediate non-specific integration of genes into a variety of sites throughout the host genome, but expression of the transgenes is frequently down-regulated by epigenetic silencing (Lund et al., 1996; Pannell and Ellis, 2001). Even when epigenetic silencing is avoided, the transgenes may activate a neighboring proto-oncogene and thereby cause, for example, leukemia (Hacein-Bey-Abina et al., 2003). In addition, neither the site of integration nor the number of integrated copies of the gene of interest can be predicted with viral systems, which makes it difficult to predict the fate of the treated cell.

Site-specific recombination systems are powerful tools for introducing predetermined modifications into the genomes of higher eukaryotes. The well-known site-specific recombinases that have been used for this purpose are Cre from bacteriophage P1 (Feng et al., 1999; Abremski et al., 1983; Sauer and Henderson, 1990; Sauer and Henderson, 1988), Flp from S. cerevisiae (O’Gorman et al., 1991) and φC31 integrase from the Streptomyces phage φC31 (Kuhstoss and Rao, 1991; Combes et al., 2002). Unlike the Cre and Flp recombinases, φC31 integrase is unidirectional: it catalyzes integration of transgenes into pseudo attachment sites (pseudo-attP) in the genome of mammalian cells (Groth et al., 2000) by recombining the attB recognition site on an episomal plasmid with one or more pseudo-attP sites in the host chromosomes (Thyagarajan et al., 2001; Olivares et al., 2002). The recombination reaction creates attL and attR sites, whose sequences differ from attB and attP and which are not substrates for the integrase. In contrast, the Cre and Flp recombinases leave two cis-positioned Lox and FRT sites, respectively, in close proximity to each other after recombination, which permits an immediate reverse reaction (excision).

Because φC31 integrase catalyzes only integration and an integrated transgene is never excised, it should be possible to insert two different plasmids carrying multiple genes successively into two or more different pseudo attP sites of the host genome. Three human somatic cell lines, HEK293, HepG2 and D407 (Chalberg et al., 2006), and one human embryonic stem cell line, BG01v, have each been found to contain 17 to 26 frequently targeted genomic pseudo-attP sites (Thyagarajan et al., 2008). Focusing on the HeLaS3 cell line, we found 39 genomic pseudo-attP sites. In this paper, we first report the characterization of the pseudo-attP sites identified in the HeLaS3 genome and then the successive integration of two different multigene expression cassettes of known genomic constitution into two or more pseudo attP sites on the chromosomal DNA. These results demonstrate stable integration of four different functional cDNAs at defined genomic sites and robust long-term expression of the multiple transgenes from the chromosomes of a single cell.


Materials and Methods

1. Plasmid construction

The expression clones (pEXPR) were constructed using the MultiSite Gateway™ system, essentially as described previously (Sasaki et al., 2004; Cheo et al., 2004; Sone et al., 2008). The plasmid pB2H1-DEST has been described previously (Thyagarajan et al., 2008). To construct the plasmid pFRT/Bla-phiC31attB, pUC-Bla-phiC31attB and pFRT/blasticidin (Yahata et al., 2005) were digested with NdeI and NruI (3199 bp and 843 bp, respectively) and ligated. The Destination vector pEF/B2B1/V5-DEST was constructed in three steps. First, the EF-1α promoter DNA (3139 bp) was prepared from pEF-DEST51 (Invitrogen Corp.) by digestion with MluI. Second, pUC-Bla-phiC31attB was prepared by ligating the self-ligated circular pUC-Bla vector digested with BglII and ScaI (3870 bp) with pB2H1-DEST digested with BamHI and ScaI (1223 bp). Third, pUC-Bla-phiC31attB digested with MluI was ligated with the EF-1α promoter DNA (3139 bp) to give pEF/B2B1/V5-DEST. pEF/B2H1/SV-DEST was constructed by ligating pB2H1/SV-DEST with EF-1α promoter DNA obtained by digestion of pEF5/FRT/V5-DEST (Invitrogen Corp.) with KpnI and NruI. pB2H1/SV-DEST was constructed by ligating pB2H1-DEST with SV40 poly(A) DNA obtained by PCR from pEF5/FRT/V5-DEST (Invitrogen Corp.).

The plasmid pB2H1-NURP was cloned as follows. The plasmid pB2H1-Z1 was generated by cloning in a filled-in RsrII-EcoRI fragment containing the Zeo ORF and SV40pA from pCMV-Zeo into the filled-in SalI site of pB2H1 (Thyagarajan et al., 2008). The plasmid pB2H1Z1U was generated by cloning in a BamHI-XbaI fragment containing the UMS sequence (Wohlgemuth et al., 1996) into pB2H1-Z1 restricted with BamHI and XbaI. The plasmid pB2H1-Z1UR4P was generated by cloning in a PCR-amplified fragment containing the R4 attP site (Olivares et al., 2001) into pB2H1-Z1U restricted with XbaI and XhoI. This plasmid was restricted with HindIII and XbaI and a PCR-amplified fragment from pcDNA3.1(+) (Invitrogen Corp.) containing the Neo ORF and SV40 poly(A) was cloned into this site to generate pB2H1-NURP.

The expression clones pEXPR-PEF1α-Mit/mKeima-2xcHS4-PEF1α-EGFP-pm, pEXPR-PEF1α-H2B-SECFP-2xcHS4-PEF1α-mCherry-β-tubulin and pEXPR-PEF1α-H2B-SECFP-PEF1α-EB1-Venus-1xcHS4-PEF1α-mCherry-β-tubulin were constructed by LR reaction with two types of Destination vectors, pEF/B2B1/V5-DEST and pEF/B2H1/SV-DEST, and four to eight types of Entry clones (Table III). The Entry clones pENTR-L1-pm-*L2, pENTR-L1-SDK-H2B-R3, pENTR-L1-SDK-Mit/mKeima-L4, pENTR-L3-SDK-EGFP-R1, pENTR-L3-SECFP-*L6, pENTR-L3-β-tubulin-*L2, pENTR-R4-pABGH-2xcHS4-PEF1α-R3, pENTR-R4-Venus-*L6, pENTR-L5-PEF1α-R3, pENTR-L5-SDK-EB1-L4, pENTR-L5-SDK-mCherry-R3 and pENTR-R6-pABGH-2xcHS4-PEF1α-R5 were constructed by Gateway cloning of attB-PCR fragments amplified from cDNA plasmids of Mit (the 29 N-terminal amino acids of cytochrome c oxidase subunit VIII) /mKeima (Kogure et al., 2006), pm (the 20 C-terminal amino acids of K-Ras) (Mochizuki et al., 2001), mCherry (Shaner et al., 2004), H2B (histone H2B, human) (Kanda et al., 1998), EB1 (end-binding 1, human) (Piehl and Cassimeris, 2003), and β-tubulin (Xenopus). Expression clones harboring chicken β-globin HS4 insulator (cHS4) cassettes were constructed with the Entry clones described previously (Yahata et al., 2007). Construction of the Entry clones of EGFP, SECFP and Venus was described previously (Sone et al., 2008).
The PCR primers used were as follows: Mit/mKeima: (FW; 5'-GGCTTCGAAGGAGATAGAACCATGTCCGTCCTGACGCCG-3' RV; 5'-GGGGCAACTTTGTATAGAAAAGTTGTCAACCGAGCAAAGAGTGGC-3'), pm: (FW; 5'-GGGGCAAGTTTGTACAAAAAAGCAGGCTTCTTCAAGATGAGCAAGGAC-3' RV; 5'-GGGGCCACTTTGTACAAGAAAGCTGTTACATGATCACGCACTT-3'), H2B: (FW; 5'-GGCTTCGAAGGAGATAGAACCatgccagagccagcgaagtc-3' RV; 5'-GGGGCAACTTTATTATACAAAGTTGcttagcgctggtgtacttgg-3'), mCherry: (FW; 5'-GGGGCAACTTTGTATACAAAAGTTGCTTGTACAGCTCGTCCATGC-3' RV; 5'-GGGGCAACTTTATTATACAAAGTTGCTTGTACAGCTCGTCCATGCC-3'), EB1: (FW; 5'-GGGGCAACTTTGTATACAAAAGTTGatggcagtgaacgtatactcaa-3' RV; 5'-GGGGCAACTTTGTATAGAAAAGTTGatactcttcttgctcctcctg-3'), β-tubulin: (FW; 5'-GGGGCAACTTTGTATAATAAAGTTGATGAGGGAAATCGTGCAC-3' RV; 5'-GGGGCCACTTTGTACAAGAAAGCTGTTAGGCATTTTCCTCCTCTT-3').

2. Cell culture and transfection

Cells were cultured in DMEM (Dulbecco’s modified Eagle’s minimum essential medium; Invitrogen Corp.) supplemented with 10% fetal bovine serum (FBS; Equitec-Bio, Inc.) at 37°C in 5% CO2. HeLaS3 cells were transfected with the expression clones using Lipofectamine™ and Plus™ Reagents (Invitrogen Corp.) according to the manufacturer’s instructions. Co-transfection with the expression clones (the pEXPR series described in the section above) and a φC31 integrase expression clone, pCMV-φC31Int, was carried out at a mass ratio of 1:1. After culture in a 6-well plate for 24 hr, the approximate number of transiently transformed cells was estimated by fluorescence microscopy, and the cells were then divided among 10 cm plates and cultured. From 48 hr after transfection, cells were selected in medium containing 2 μg/ml of blasticidin S HCl (Invitrogen Corp.) or 200 μg/ml of hygromycin B (Invitrogen Corp.). Selection continued for 10 to 14 days, at which time colonies became visible. Individual colonies were picked with a pipette tip and transferred to individual wells of a 48-well plate. Surviving colonies were expanded for stock and plasmid rescue.

3. Microscopy

For routine observation of transient and stable transformants, a fluorescence microscope (Eclipse TE2000-E; Nikon) with filter sets for CFP/YFP/Cy5 (86008v1; Chroma Technology Corp.) and BEF/GFP/DsRed (86009; Chroma Technology Corp.) was used.

4. Plasmid rescue and sequence analysis

Genomic DNA was isolated with the Wizard® SV Genomic DNA Purification System (Promega). Between 10 and 20 μg of genomic DNA was digested with NheI, SpeI and XbaI (TAKARA), none of which cleaves within the attB donor plasmid used. The DNA was then extracted with phenol/chloroform, precipitated in ethanol, and ligated under dilute conditions with bacteriophage T4 DNA ligase. After overnight incubation at 16°C, the DNA was precipitated in ethanol and resuspended in water. Electrocompetent cells (ElectroMAX DH10B Cells; Invitrogen Corp.) were then electroporated with the ligated DNA using the Gene Pulser Xcell (BIO-RAD Corp.) under the recommended conditions. The transformants were plated on Bacto-Agar plates containing ampicillin. Plasmid DNA was prepared from the resulting colonies and sequenced with primers (ChoSeqR: 5'-TCCCGTGCTCACCGTGACCAC-3') as described (Chalberg et al., 2006). Sequence data were analyzed using Sequencher software (Gene Codes Corp.). The genomic integration site was determined by matching the sequence read to the genome database using BLAT (http://genome.ucsc.edu/).
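
The requirement that the rescue enzymes do not cut within the donor plasmid can be checked computationally before a rescue experiment. The following Python sketch is illustrative only and is not the procedure used in this study: it scans a circular plasmid sequence for the recognition sites of NheI (GCTAGC), SpeI (ACTAGT) and XbaI (TCTAGA). The function name `count_cut_sites` and the two toy sequences are hypothetical.

```python
# Recognition sequences of the three rescue enzymes. All three are
# palindromic, so scanning one strand is sufficient.
RESCUE_SITES = {"NheI": "GCTAGC", "SpeI": "ACTAGT", "XbaI": "TCTAGA"}


def count_cut_sites(plasmid_seq, sites=RESCUE_SITES):
    """Count occurrences of each recognition site in a circular plasmid."""
    seq = plasmid_seq.upper()
    counts = {}
    for enzyme, site in sites.items():
        # Append the first len(site)-1 bases so that sites spanning the
        # arbitrary origin of the circular sequence are also found.
        circular = seq + seq[: len(site) - 1]
        counts[enzyme] = sum(
            circular.startswith(site, i) for i in range(len(seq))
        )
    return counts


# Toy sequences for illustration only (not real plasmid sequences).
toy_donor = "ATGGCCAAGTTGACCAGTGCCGTTCCGGTGCTCACC"
toy_cut = "TAGCATGGCCAAGTTGACCGC"  # GCTAGC spans the origin when circular

print(count_cut_sites(toy_donor))
print(count_cut_sites(toy_cut))
```

A donor plasmid is suitable for this rescue strategy only if every count is zero. All three enzymes leave the same CTAG overhang, which is why the mixed digest can be recircularized in a single ligation.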

5. Bioinformatics analysis of the motifs

The 39 pseudo site sequences rescued in this study were analyzed with the web-based MEME motif finder (http://meme.sdsc.edu/meme.html). The program was used to find motifs of 20–40 bp within the 100 bp of sequence surrounding the point of crossover. The wild-type φC31 attP site was also included in the analysis. A common motif was discovered in all the pseudo sites.
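
Preparing the input for such a motif search amounts to cutting a fixed window around each crossover point and writing the windows in FASTA format. The Python sketch below shows one way this could be done, assuming the chromosome sequence is already available as a string and the crossover is given as a 1-based coordinate; the function names, the record name `toy_site` and the toy sequence are hypothetical, and the window is arranged so that the crossover base falls at position 51 of 100.

```python
def crossover_window(chrom_seq, crossover_pos, flank=50):
    """Return the 2*flank bp window around a 1-based crossover position.

    The window holds `flank` bases upstream of the crossover base, the
    crossover base itself, and flank-1 bases downstream, so the crossover
    base is at position flank+1 (51 for the default 100 bp window).
    """
    start = crossover_pos - flank - 1  # 0-based start of the window
    end = start + 2 * flank
    if start < 0 or end > len(chrom_seq):
        raise ValueError("window extends beyond the sequence")
    return chrom_seq[start:end]


def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy sequence for illustration: the crossover base (G) is at 1-based
# position 61 of a 120 bp sequence.
toy_chrom = "T" * 10 + "A" * 50 + "G" + "C" * 49 + "T" * 10
window = crossover_window(toy_chrom, 61)
print(to_fasta([("toy_site", window)]))
```

The resulting FASTA text can be supplied to a motif finder together with the corresponding window around the native attP site.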


Results

Genomic integration sites in HeLaS3

To identify the pseudo attP sites for φC31 integrase in the HeLaS3 genome, we used recombinant plasmid vectors bearing attB sites that differed in size and contained one, two or three different cDNA expression cassettes in a single plasmid. The vectors carrying one cDNA expression cassette are pB2H1-NURP and pFRT/Bla-phiC31attB (4–7 kb); the vectors carrying two cDNA expression cassettes are pEXPR-PEF1α-mit/mKeima-2xcHS4-PEF1α-EGFP-pm and pEXPR-PEF1α-H2B-SECFP-2xcHS4-PEF1α-mCherry-β-tubulin (12–14 kb); and the vector carrying three cDNA expression cassettes is pEXPR-PEF1α-H2B-SECFP-PEF1α-EB1-Venus-1xcHS4-PEF1α-mCherry-β-tubulin (16 kb). HeLaS3 cells were transfected with a plasmid containing tandem mit/mKeima and EGFP-pm cDNA expression cassettes (the upstream mKeima cDNA tagged with a mitochondrial-localization sequence (mit) and the downstream EGFP cDNA tagged with a plasma membrane-localization sequence (pm)), along with a plasmid expressing φC31 integrase. Both reporters were thereby introduced simultaneously into the HeLaS3 genome by φC31 integrase at particular pseudo attachment sites (Fig. 1A). This method led to stoichiometric integration of the two fused genes without gene dosage variation, and thus to coordinated expression levels of the cDNAs, as reported previously (Yahata et al., 2007). To mitigate possible mutual transcriptional interference between the two identical regulatory elements (EF-1α promoters), two tandem cHS4 insulators were inserted between the two reporter cDNAs (Yahata et al., 2007). In a similar experiment, in which cells were transfected with the plasmid containing three tandem cDNA expression cassettes carrying H2B-SECFP, EB1-Venus and mCherry-β-tubulin, the three fluorescently tagged proteins were produced at nearly equivalent levels when a cHS4 insulator unit was inserted between the second and third cDNAs (Sone et al., 2009) (Fig. 1B). These tandem two- and three-cDNA expression clones, which exhibit well-balanced expression from the genome, were used as tools to characterize the φC31 integrase system in the present work.


Fig. 1.
Expression profiles of the tandem two and three gene expression clones integrated into pseudo attP sites in the HeLaS3 genome. The profiles of the stably transfected cells harboring a tandem two gene expression clone, pEXPR-PEF1α-Mit/mKeima-2xcHS4-PEF1α-EGFP-pm, and a tandem three gene expression clone, pEXPR-PEF1α-H2B-SECFP-PEF1α-EB1-Venus-PEF1α-mCherry-β-tubulin, are shown in (A) and (B), respectively. Cells were analyzed under a fluorescence microscope without fixation. The former tandem two gene expression clone (A) was integrated at 1p22.3 and 12q22, and the latter tandem three gene expression clone (B) was integrated at 1q22. The white scale bar is 20 μm. The other conditions and descriptions are as given in the text and Materials and Methods.


HeLaS3 cells were transfected with the three types of reporter cDNA expression constructs described above in the presence or absence of the φC31 integrase expression plasmid and then cultured in the presence of blasticidin or hygromycin (used for duplicate transfection of two constructs). After two weeks of blasticidin or hygromycin selection of cells co-transfected with the integrase plasmid, the efficiency of drug-resistant, reporter-positive colony production was approximately 3.0×10⁻⁴ (76 colonies from 2.5×10⁵ transfected cells). The efficiency decreased from 5.2×10⁻⁴ to 0.9×10⁻⁴ as the plasmid size increased from 4 to 16 kb. Co-transfection of the three size classes of multigene constructs with the φC31 integrase cDNA showed the following trend: 1) the frequency of blasticidin- or hygromycin-resistant colonies was 5.2×10⁻⁴, 2.0×10⁻⁴ and 0.9×10⁻⁴ per transfected cell for the one-, two- and three-cDNA expression constructs of 4–7, 12–14 and 16 kb, respectively; 2) the true positive stably transfected colonies obtained numbered 45, 17 and 14 for the one-, two- and three-cDNA constructs, respectively; and 3) the integration events at pseudo attP sites numbered 34, 16 and 5 for the one-, two- and three-cDNA constructs, respectively (Table I). These results suggest that the frequency of φC31-mediated integration increases as the plasmid size decreases, and that the occurrence of aberrant events in the transgene cassette during DNA recombination is not significantly related to construct size.
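
The overall efficiency quoted above follows directly from the colony counts. As a worked check, the short Python sketch below reproduces the arithmetic from the figures given in the text (45, 17 and 14 colonies; 2.5×10⁵ transfected cells; per-construct frequencies of 5.2×10⁻⁴ and 0.9×10⁻⁴); the variable and function names are illustrative.

```python
def colony_efficiency(colonies, transfected_cells):
    """Fraction of transfected cells that gave a positive colony."""
    return colonies / transfected_cells


# Colony counts for the one-, two- and three-cDNA constructs (from the text).
colonies_by_construct = {"one cDNA": 45, "two cDNA": 17, "three cDNA": 14}
total_colonies = sum(colonies_by_construct.values())

overall = colony_efficiency(total_colonies, 2.5e5)
print(total_colonies)      # 76
print(f"{overall:.1e}")    # 3.0e-04

# Fold difference between the smallest (4-7 kb) and largest (16 kb)
# constructs, using the per-construct frequencies reported in the text.
fold_drop = 5.2e-4 / 0.9e-4
print(f"{fold_drop:.1f}")  # 5.8
```

The roughly 5.8-fold drop between the smallest and largest constructs quantifies the size dependence described above.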



In either case, the average increase in the positive transformant colonies was 4-fold over random integration without integrase. These results suggest that φC31 integrase has mediated site-specific integration of the recombinant expression plasmids bearing attB into pseudo attP sites in HeLaS3 chromosome. To investigate whether the colonies obtained were the result of φC31-mediated integration, the site of integration was determined using plasmid rescue and sequencing (Chalberg et al., 2006). We examined 35 colonies and identified 55 distinct integration events (some cells had multiple events). These events represented integration at 39 different sites since some sites were identified multiple times. As shown in Table I, 34 integration events were with the plasmids carrying a single cDNA expression cassette, 16 events were with the plasmids of the two cDNA expression cassette and 5 events were with the plasmid of the three cDNA expression cassette. These integrated DNA constructs maintained their pre-transfection configuration and exhibited the expected expression profiles of their respective transgenes. With regard to cells transfected with multi-cDNA constructs, approximately 10% of the blasticidin- or hygromycin-resistant colonies were reporter-positive stable transfectants containing intact multiple cDNA cassettes. The remaining colonies lacked part of the expected fluorescent profiles and presumably possessed some defect such as partial deletions or rearrangements of the cDNA cassette.

Chromosomal loci 19q13.31 and 12q22 were the most frequently used for φC31-mediated integration of the three types of multi-cDNA constructs described above. Of the 55 integration events tested, 7 were identified at 19q13.31 and 5 at 12q22. On the other hand, 36 of the integration events were duplicate integrations distributed among 27 different integration sites (Table II). Eleven transformant colonies obtained with the single cDNA expression cassettes (Table II; Vectors A and B) showed simultaneous integration at two or three sites in the genome. Four transformant colonies with the tandem two cDNA expression cassettes (Table II; Vectors C and D) showed genomic integration at two or four sites, while one transformant colony containing the tandem three cDNA cassettes (Table II; Vector E) exhibited simultaneous integration at two sites in the genome. As shown in the right column of Table II, single integration sites were observed in 3 and 6 transformant colonies carrying the A and B vectors, respectively, and in 2 and 3 colonies harboring the C and D vectors, respectively. Three colonies with the three cDNA cassettes (E) exhibited single-site integration. It is unlikely that the observed recurrent integration into multiple sites resulted from a mixture of separate independent colonies, since the rescued plasmids that were sequenced were derived from a single colony purified after four to five passages over approximately two to three weeks in culture. Although the cDNA expression cassettes were integrated at multiple sites in the same cell, the overall expression levels were not significantly increased, remaining at the level of a single-site integration (data not shown).







Genomic features of φC31 integrase-mediated integration sites

The existence of sequence motifs defining φC31 pseudo attP sites has been proposed on the basis of integration-site sequences identified in the somatic cell lines HEK293, HepG2 and D407 and in the two stem cell lines BG01V and SA002 (Thyagarajan et al., 2001; Chalberg et al., 2006; Thyagarajan et al., 2008). To characterize φC31 integrase-mediated integration sites in HeLaS3 cells, we analyzed the DNA sequences of the integration sites with MEME, software designed to discover sequence motifs in biopolymers (Bailey and Elkan, 1994). To search for motifs among the 39 sites identified for φC31 integrase-mediated integration, the 100 bp of genomic DNA sequence surrounding each crossover site was obtained from the UCSC Human Genome Database (March 2006 assembly) and processed with MEME, along with the 100 bp surrounding the native attP. Significant 33 bp motifs were found in 33 of the 100 bp sequences surrounding φC31 integration sites and are indicated in Fig. 2A, with the crossover site set at 51 bp. In almost all cases, the motif was found in the expected region near the observed crossover point. In the alignment of the motif sequences of the 33 sites shown in Fig. 2B, the 28 bp sequences contain inverted repeats that create a partially palindromic structure, as is usually seen at site-specific crossover sites such as att sites, where one integrase molecule is expected to bind at each half-site (Smith and Thorpe, 2002). These results support the conclusion that the φC31 integrase-mediated integration sites listed in Table I are pseudo attP sites similar to those reported previously in cells such as HEK293, HepG2, D407 and BG01V (Thyagarajan et al., 2001; Chalberg et al., 2006; Thyagarajan et al., 2008). As shown in Table I, 26 pseudo attP sites are located in intergenic regions and 13 sites are in introns of genes.
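
The partially palindromic character described above can be expressed as a simple score: the fraction of positions at which a sequence agrees with its own reverse complement. The Python sketch below is one possible way to compute such a score and is not part of the MEME analysis reported here; the function names and the short test sequences are hypothetical, and only unambiguous A/C/G/T bases are handled.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def palindromic_fraction(seq):
    """Fraction of positions where seq matches its reverse complement.

    A perfect inverted repeat (e.g. a restriction site such as GAATTC)
    scores 1.0; a partially palindromic site scores between 0 and 1.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    matches = sum(a == b for a, b in zip(seq, rc))
    return matches / len(seq)


# Toy sequences for illustration only.
print(palindromic_fraction("GAATTC"))    # perfect palindrome
print(palindromic_fraction("ACGTTCGT"))  # partial palindrome
```

Applied to a motif of the kind shown in Fig. 2B, a score well above what unrelated sequence of the same length typically gives would be consistent with two half-sites arranged as an inverted repeat.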


Fig. 2.
DNA sequence features of φC31 hotspots. φC31 integrase-mediated pseudo sites identified in HeLaS3 were analyzed along with the native φC31 attP for the presence of a common motif by using the MEME motif finder to analyze 100 bp of genomic DNA surrounding the observed crossover site. (A) A block diagram indicating the positions of the motif discovered. The 33 bp attP motif appeared in all 33 of the included sequences close to the area of observed crossover (indicated by the 50 bp midpoint of the sequence). (B) The multilevel consensus sequence derived from the motif shown in (A). The consensus is symmetrical about the core and contains inverted repeats extending over the length of the consensus, indicated by arrows. The other descriptions are as given in the text.


Successive integration of two separate multigene expression cassettes into different sites in a single cell

As shown in Table II, 16 φC31-mediated stably transfected colonies exhibited integration into multiple pseudo attP sites in the genome. This opens the possibility of successive integration of two different multigene expression constructs into different pseudo attP sites. Recently, demand for technology that delivers multiple genes into living cells has been increasing, particularly for technologies based on site-specific genomic integration (Sorrell and Kolb, 2005; Yamanaka, 2008). φC31 integrase-mediated integration has been suggested as one of the most useful non-viral gene delivery systems for this application (Calos, 2006). Using multiple single-cDNA donor vectors to introduce two or three different genes simultaneously into a site or sites in the cellular chromosomal DNA would be difficult. On the other hand, this is easily achieved with a multigene vector carrying two or three cDNA expression cassettes (Fig. 1). However, constructing a vector with four or five expression cassettes would be more difficult, mainly because of the large construct size and the difficulty of achieving well-balanced expression from multiple transgenes situated in close proximity to one another in the genome (de Felipe, 2002).

As mentioned above, the φC31 integrase reaction is unidirectional, leading to stable integration of a multigene cassette inserted at a pseudo attP site. Furthermore, the occupied site cannot accept an additional attB construct. Thus, a second multigene construct can be introduced into another pseudo attP site among the set of sites present in the genomic DNA. By successive integration of two different constructs, each containing two tandem fused cDNAs, we succeeded in introducing four cDNAs fused with different colored fluorescent protein tags into three sites on the HeLaS3 chromosomal DNA. The HeLaS3 transfected clone harboring the construct with tandem fused cDNAs (pEXPR-PEF1α-mit/mKeima-2xcHS4-PEF1α-EGFP-pm) at the 1p22.3 and 12q22 sites was additionally transfected with another tandem two cDNA-carrying construct (pEXPR-PEF1α-H2B-SECFP-2xcHS4-PEF1α-mCherry-β-tubulin). The pseudo attP site where the second construct was introduced was mapped to 4q25, indicating the successful integration of four different genes at known chromosomal loci in a single cell (Fig. 3).


Fig. 3.
Successive integration of two different multigene expression clones into the HeLaS3 genome. A presentation of the expression plasmids carrying tandem mit/mKeima (white) and EGFP-pm (black) with 2xcHS4 cassettes (gray shaded) and H2B-SECFP (white) and mCherry-β-tubulin (black) with 2xcHS4 cassettes (gray shaded) is shown on the top or bottom. The former expression clone was integrated at 1p22.3 and 12q22, and the latter clone was integrated at 4q25. The cells were analyzed under a fluorescence microscope without fixation. The white scale bar is 20 μm. The other conditions and descriptions are as given in Fig. 1, text and Materials and Methods.


The foregoing experiments demonstrate that two different multigene constructs were efficiently introduced into the HeLaS3 genome using φC31 integrase. Analysis of the plasmids rescued from the integration sites showed that integration of a plasmid bearing attB into a chromosomally located pseudo attP site is invariably precise, yielding the expected recombination event at the DNA sequence level. An advantage of the φC31 integrase-mediated recombination system over the conventional site-specific recombination systems, such as Flp/FRT and Cre/Lox, is the ability to map the precise genomic position where the transgene is integrated. By contrast, the position of an FRT or Lox site that has been placed at an unknown location in the genome by random integration is usually difficult to determine. Fig. 4 indicates the chromosomal location of a tandem gene expression cassette (pEXPR-PEF1α-mit/mKeima-2xcHS4-PEF1α-EGFP-pm) integrated at a pseudo attP site at 12q22, which is intergenic. The transgenes introduced into HeLaS3 genomic sites were stable and robustly expressed throughout 40 passages over four months, and also in cultures re-established after long-term storage at –80°C or in liquid nitrogen (data not shown), consistent with the previous report (Chalberg et al., 2005).


Fig. 4.
Genomic map of the integration site of a tandem two cDNA expression clone. The pseudo attP site at 12q22 where the tandem two gene expression clone, pEXPR-PEF1α-Mit/mKeima-2xcHS4-PEF1α-EGFP-pm, is integrated is shown. The expression profile of the clone is indicated in Fig. 1A. The other interpretations are as given in the text.



Discussion

Characterization of HeLaS3 pseudo attP sites

In the present study, we identified 39 pseudo attP sites in the HeLaS3 genome that can be used for φC31 integrase-mediated site-specific recombination. While there have been some reports of chromosomal abnormalities (rearrangement, translocation, deletion, insertion) in the host genome surrounding the integrated sites after φC31-mediated recombination, these generally occur at low frequency (Thyagarajan et al., 2001; Ehrhardt et al., 2006). In order to identify the 39 pseudo attP sites reported here, we determined the upstream flanking DNA sequences (600–700 bp) from the attR-genome junction by cycle sequencing with the plasmids rescued from 70 drug resistant colonies. About 50% of the DNA sequences from the colonies bear a short stretch of mismatches, deletions, or other defective events compared with the BLAT database (http://genome.ucsc.edu/). A portion of those may be due to an aberrant chromosomal event which occurred during recombination by φC31 integrase. We selected 55 independent integration events derived from 35 drug resistant colonies whose flanking sequences matched exactly to that of the database. While a portion of these 55 events might involve some chromosomal rearrangements in the downstream flanking DNA sequences at the attL-genome junction, it would most likely be at a frequency less than 7.5% (Ehrhardt et al., 2006). The 55 integration events were distributed among 39 different pseudo attP sites, including 36 integration events which are recurrent integrations among 27 different “hot spot” pseudo attP sites (Table II).

With respect to identification of pseudo attP sites, the human somatic cell lines such as HEK293, D407 and HepG2 (Chalberg et al., 2006) and the human embryonic stem cell lines, BG01v and SA002 (Thyagarajan et al., 2008), have been well studied. The 39 pseudo attP sites of HeLaS3 found in the present study show similarity to the native φC31 attP and share a common motif containing an inverted repeat (Fig. 2B), consistent with previous reports (Chalberg et al., 2006; Thyagarajan et al., 2008). The motifs identified in the 39 integration sites are located close to the crossover site, suggesting involvement in the integrase-mediated recombination (Fig. 2A). Of the 39 pseudo attP sites identified in HeLaS3, 6 sites (19q13.31, 10q21.2, Xq22.1, 2q31.1, 17q25.1-1 and 17q25.1-2) are common to HEK293 cells (Chalberg et al., 2006), and 2 sites (12q22 and 11q23.3) are also found in the human embryonic stem cell lines (Thyagarajan et al., 2008). Further, of the 39 pseudo attP sites identified in this study, 9 sites were used recurrently, particularly 19q13.31 and 12q22 were most frequently used (7 and 5 integration events, respectively) (Table I). The 19q13.31 site has also been reported as the most favored unique sequence for φC31-integrase in three other somatic cell lines (Chalberg et al., 2006).

Successive integration of multiple transgenes

In the present study, we used five multigene expression constructs that harbor two or three tandemly situated cDNAs tagged with different fluorescent protein cDNAs in a single plasmid. By varying the strength of promoters used for transcribing the cDNAs, we determined that these constructs are useful for optimizing relative gene expression levels to pre-determined and different levels (Yahata et al., 2005). As shown in Fig. 1 and Fig. 3, tandem two and three gene expression cassettes integrated by φC31 integrase, whose transcription was directed by the same EF-1α promoter, resulted in robust and comparable expression levels of each transgene. This is due, in large part, to the cHS4 insulators inserted between cDNAs restoring expression since the expression of tandem transgenes is often subject to mutual transcription interference by each of the regulatory elements situated in close proximity to one another (Kadesch and Berg, 1986; Proudfoot, 1986). Although the chromosomal transgenes integrated by φC31-mediated recombination have been reported to maintain their robust expression without suffering epigenetic silencing phenomena (Calos, 2006; Chalberg et al., 2005) (this observation), the cHS4 elements inserted in the multi-gene construct are assumed to play a role in ensuring further effective expression by facilitating active chromatin conformation around the transgenes (Yahata et al., 2007), recruiting CTCF (CCCTC-binding factor) and histone modifiers such as USF1 (upstream stimulatory factor 1) cooperatively (Yahata et al., 2007).

The multigene constructs used in this study are useful tools to introduce multiple genes simultaneously into the genome. The multigene-carrying plasmids ensure that all cDNAs are present at the same chromosomal locus in stoichiometric amounts without gene dosage variation. By combining this with φC31-integration technology, we now show that stably transfected cells harbor two or three cDNAs at a definite site on the chromosome, and the integrated site can be found precisely according to the plasmid rescue strategy (Fig. 4). Site-specific recombinases and integrases have provided attractive tools for genomic integration of genes and targeted modification of mammalian genomes (Sorrell and Kolb, 2005). Unlike Cre and Flp recombinases which depend on presetting of the target loxP and FRT signals respectively onto the genome by inefficient random integration, φC31-integrase acts at chosen native pseudo attP sequences within unmodified mammalian chromosomes. Moreover, φC31 is a unidirectional integrase that only supports integration, in contrast to reversible reaction of Cre and Flp recombinases. As we increase our knowledge of multigene diseases and with an eye on development of gene therapy treatments, technologies such as those described in this study would be useful for genomic introduction of multiple genes and their well-balanced co-expression. Viral vectors capable of expressing multiple genes have been explored and have met with several challenges (de Felipe, 2002; Szymczak et al., 2004; Chinnasamy et al., 2006). By employing successive φC31-mediated integration, we introduced two separate two gene expression constructs into different chromosomal loci; for example, one tandem two gene cassette at 1p22.3 and 12q22 sites and another tandem two cDNA-expressing cassette at 4q25 site, thereby constructing four transgenes being expressed from the chromosome of a single HeLaS3 cell (Fig. 3). 
The four proteins synthesized from these two transgene cassettes localized to their expected intracellular compartments, giving multi-color images within a single cell. By successive integration of two different expression constructs containing tandem two and three fused cDNAs, we succeeded in introducing five different cDNAs, each fused to a differently colored fluorescent protein tag, into sites on the HeLaS3 chromosomal DNA, and we observed robust expression of the five fused cDNAs by multi-color spectral imaging using the 32 channels of the LSM 710 META system (Carl Zeiss) (data not shown). Specifically, the HeLaS3 clone harboring the construct with tandem fused cDNAs (pEXPR-PEF1α-mit/mKeima-2xcHS4-PEF1α-EGFP-pm) at the 1p22.3 and 12q22 sites was additionally transfected with another construct carrying three tandem cDNAs (pEXPR-PEF1α-H2B-SECFP-PEF1α-Tau/Venus-2xcHS4-PEF1α-golgi-mKO). However, we could not identify the integration site of the second construct by the plasmid rescue method used. It might be possible to identify the site by employing oligonucleotide primers specific for the vector backbone of the second construct.

The utility of this φC31-mediated integration system is enhanced by combining it with the ability to build multigene expression vectors that deliver the required set of genes into living cells. The present study demonstrates that φC31-mediated integration of plasmids containing multiple cDNAs makes it possible to introduce multiple heterologous genes of known genomic constitution into defined chromosomal loci. The transgenes thus introduced exhibit well-balanced co-expression owing to the cHS4 insulators inserted between the cDNAs, which further improve the long-term performance of transgene expression by protecting the transgenes from chromosomal position effects (Burgess-Beusse et al., 2002). We are working further to position cHS4 insulator elements on both sides of the transgenes in the hope that the elements will act as a barrier, thereby preventing a strong promoter from insertionally activating surrounding genomic elements such as proto-oncogenes (Yahata et al., 2007). We expect that the multigene-expressing cell clones constructed by φC31-mediated integration will benefit various biological and medical applications in the post-genomic era.

Acknowledgements

The authors are grateful to Dr. Michele P. Calos for providing pCMV-φC31Int and pTA-attB, Dr. Roger Y. Tsien for providing mCherry, and Drs. Masatoshi Takagi and Naoko Imamoto, Riken (Japan) for H2B, EB1 and α-tubulin cDNA plasmids. This work was supported in part by a Grant-in-Aid for Scientific Research from The New Energy and Industrial Technology Development Organization, Japan (NEDO). This work was carried out at the Invitrogen Corporation-Endowed Laboratory and financially supported in part by a fostering fund of Invitrogen Co. on the development of Multisite Gateway DNA cloning system. Gateway®, Max Efficiency and Library Efficiency are registered trademarks of Invitrogen Corp. Clonase, pDONR, DH10B, ElectroMAX DH10B, DB3.1, pENTR, pDEST, Lipofectamine and Plus Reagents are trademarks of Invitrogen Corp.


References
Abremski, K., Hoess, R., and Sternberg, N. 1983. Studies on the properties of P1 site-specific recombination: evidence for topologically unlinked products following recombination. Cell, 32: 1301–1311.
Bailey, T.L. and Elkan, C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol., 2: 28–36.
Burgess-Beusse, B., Farrell, C., Gaszner, M., Litt, M., Mutskov, V., Recillas-Targa, F., Simpson, M., West, A., and Felsenfeld, G. 2002. The insulation of genes from external enhancers and silencing chromatin. Proc. Natl. Acad. Sci. USA, 99 Suppl. 4: 16433–16437.
Calos, M.P. 2006. The φC31 integrase system for gene therapy. Curr. Gene Ther., 6: 633–645.
Chalberg, T.W., Genise, H.L., Vollrath, D., and Calos, M.P. 2005. φC31 integrase confers genomic integration and long-term transgene expression in rat retina. Invest. Ophthalmol. Vis. Sci., 46: 2140–2146.
Chalberg, T.W., Portlock, J.L., Olivares, E.C., Thyagarajan, B., Kirby, P.J., Hillman, R.T., Hoelters, J., and Calos, M.P. 2006. Integration specificity of phage φC31 integrase in the human genome. J. Mol. Biol., 357: 28–48.
Cheo, D.L., Titus, S.A., Byrd, D.R., Hartley, J.L., Temple, G.F., and Brasch, M.A. 2004. Concerted assembly and cloning of multiple DNA segments using in vitro site-specific recombination: functional analysis of multi-segment expression clones. Genome Res., 14: 2111–2120.
Chinnasamy, D., Milsom, M.D., Shaffer, J., Neuenfeldt, J., Shaaban, A.F., Margison, G.P., Fairbairn, L.J., and Chinnasamy, N. 2006. Multicistronic lentiviral vectors containing the FMDV 2A cleavage factor demonstrate robust expression of encoded genes at limiting MOI. Virol. J., 3: 14.
Combes, P., Till, R., Bee, S., and Smith, M.C. 2002. The Streptomyces genome contains multiple pseudo-attB sites for the φC31-encoded site-specific recombination system. J. Bacteriol., 184: 5746–5752.
De Felipe, P. and Izquierdo, M. 2000. Tricistronic and tetracistronic retroviral vectors for gene transfer. Hum. Gene Ther., 11: 1921–1931.
de Felipe, P. 2002. Polycistronic viral vectors. Curr. Gene Ther., 2: 355–378.
Ehrhardt, A., Engler, J.A., Xu, H., Cherry, A.M., and Kay, M.A. 2006. Molecular analysis of chromosomal rearrangements in mammalian cells after φC31-mediated integration. Hum. Gene Ther., 17: 1077–1094.
Feng, Y.Q., Seibler, J., Alami, R., Eisen, A., Westerman, K.A., Leboulch, P., Fiering, S., and Bouhassira, E.E. 1999. Site-specific chromosomal integration in mammalian cells: highly efficient CRE recombinase-mediated cassette exchange. J. Mol. Biol., 292: 779–785.
Gonzalez-Nicolini, V., Sanchez-Bustamante, C.D., Hartenbach, S., and Fussenegger, M. 2006. Adenoviral vector platform for transduction of constitutive and regulated tricistronic or triple-transcript transgene expression in mammalian cells and microtissues. J. Gene Med., 8: 1208–1222.
Groth, A.C., Olivares, E.C., Thyagarajan, B., and Calos, M.P. 2000. A phage integrase directs efficient site-specific integration in human cells. Proc. Natl. Acad. Sci. USA, 97: 5995–6000.
Hacein-Bey-Abina, S., Von Kalle, C., Schmidt, M., McCormack, M.P., Wulffraat, N., Leboulch, P., Lim, A., Osborne, C.S., Pawliuk, R., Morillon, E., Sorensen, R., Forster, A., Fraser, P., Cohen, J.I., de Saint Basile, G., Alexander, I., Wintergerst, U., Frebourg, T., Aurias, A., Stoppa-Lyonnet, D., Romana, S., Radford-Weiss, I., Gross, F., Valensi, F., Delabesse, E., Macintyre, E., Sigaux, F., Soulier, J., Leiva, L.E., Wissler, M., Prinz, C., Rabbitts, T.H., Le Deist, F., Fischer, A., and Cavazzana-Calvo, M. 2003. LMO2-associated clonal T cell proliferation in two patients after gene therapy for SCID-X1. Science, 302: 415–419.
Kadesch, T. and Berg, P. 1986. Effects of the position of the simian virus 40 enhancer on expression of multiple transcription units in a single plasmid. Mol. Cell Biol., 6: 2593–2601.
Kanda, T., Sullivan, K.F., and Wahl, G.M. 1998. Histone-GFP fusion protein enables sensitive analysis of chromosome dynamics in living mammalian cells. Curr. Biol., 8: 377–385.
Kogure, T., Karasawa, S., Araki, T., Saito, K., Kinjo, M., and Miyawaki, A. 2006. A fluorescent variant of a protein from the stony coral Montipora facilitates dual-color single-laser fluorescence cross-correlation spectroscopy. Nat. Biotechnol., 24: 577–581.
Kuhstoss, S. and Rao, R.N. 1991. Analysis of the integration function of the streptomycete bacteriophage φC31. J. Mol. Biol., 222: 897–908.
Lund, A.H., Duch, M., and Pedersen, F.S. 1996. Transcriptional silencing of retroviral vectors. J. Biomed. Sci., 3: 365–378.
Mochizuki, N., Yamashita, S., Kurokawa, K., Ohba, Y., Nagai, T., Miyawaki, A., and Matsuda, M. 2001. Spatio-temporal images of growth-factor-induced activation of Ras and Rap1. Nature, 411: 1065–1068.
Ngoi, S.M., Chien, A.C., and Lee, C.G. 2004. Exploiting internal ribosome entry sites in gene therapy vector design. Curr. Gene Ther., 4: 15–31.
O’Gorman, S., Fox, D.T., and Wahl, G.M. 1991. Recombinase-mediated gene activation and site-specific integration in mammalian cells. Science, 251: 1351–1355.
Olivares, E.C., Hollis, R.P., and Calos, M.P. 2001. Phage R4 integrase mediates site-specific integration in human cells. Gene, 278: 167–176.
Olivares, E.C., Hollis, R.P., Chalberg, T.W., Meuse, L., Kay, M.A., and Calos, M.P. 2002. Site-specific genomic integration produces therapeutic Factor IX levels in mice. Nat. Biotechnol., 20: 1124–1128.
Pannell, D. and Ellis, J. 2001. Silencing of gene expression: implications for design of retrovirus vectors. Rev. Med. Virol., 11: 205–217.
Piehl, M. and Cassimeris, L. 2003. Organization and dynamics of growing microtubule plus ends during early mitosis. Mol. Biol. Cell, 14: 916–925.
Proudfoot, N.J. 1986. Transcriptional interference and termination between duplicated alpha-globin gene constructs suggests a novel mechanism for gene regulation. Nature, 322: 562–565.
Sasaki, Y., Sone, T., Yoshida, S., Yahata, K., Hotta, J., Chesnut, J.D., Honda, T., and Imamoto, F. 2004. Evidence for high specificity and efficiency of multiple recombination signals in mixed DNA cloning by the Multisite Gateway system. J. Biotechnol., 107: 233–243.
Sauer, B. and Henderson, N. 1988. Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1. Proc. Natl. Acad. Sci. USA, 85: 5166–5170.
Sauer, B. and Henderson, N. 1990. Targeted insertion of exogenous DNA into the eukaryotic genome by the Cre recombinase. New Biol., 2: 441–449.
Shaner, N.C., Campbell, R.E., Steinbach, P.A., Giepmans, B.N., Palmer, A.E., and Tsien, R.Y. 2004. Improved monomeric red, orange and yellow fluorescent proteins derived from Discosoma sp. red fluorescent protein. Nat. Biotechnol., 22: 1567–1572.
Smith, M.C. and Thorpe, H.M. 2002. Diversity in the serine recombinases. Mol. Microbiol., 44: 299–307.
Sone, T., Nishiumi, F., Yahata, K., Sasaki, Y., Kishine, H., Ando, T., Inoue, K., Thyagarajan, B., Chesnut, J., and Imamoto, F. 2009. Cell engineering using integrase and recombinase systems. In Emerging Technology Platforms for Stem Cells (U. Lakshmipathy, J.D. Chesnut, and B. Thyagarajan eds.), Wiley and Sons, Inc., New Jersey, pp. 379–394.
Sone, T., Yahata, K., Sasaki, Y., Hotta, J., Kishine, H., Chesnut, J.D., and Imamoto, F. 2008. Multi-gene gateway clone design for expression of multiple heterologous genes in living cells: modular construction of multiple cDNA expression elements using recombinant cloning. J. Biotechnol., 136: 113–121.
Sorrell, D.A. and Kolb, A.F. 2005. Targeted modification of mammalian genomes. Biotechnol. Adv., 23: 431–469.
Szymczak, A.L., Workman, C.J., Wang, Y., Vignali, K.M., Dilioglou, S., Vanin, E.F., and Vignali, D.A. 2004. Correction of multi-gene deficiency in vivo using a single ‘self-cleaving’ 2A peptide-based retroviral vector. Nat. Biotechnol., 22: 589–594.
Szymczak, A.L. and Vignali, D.A. 2005. Development of 2A peptide-based strategies in the design of multicistronic vectors. Expert Opin. Biol. Ther., 5: 627–638.
Thyagarajan, B., Olivares, E.C., Hollis, R.P., Ginsburg, D.S., and Calos, M.P. 2001. Site-specific genomic integration in mammalian cells mediated by phage φC31 integrase. Mol. Cell Biol., 21: 3926–3934.
Thyagarajan, B., Liu, Y., Shin, S., Lakshmipathy, U., Scheyhing, K., Xue, H., Ellerstrom, C., Strehl, R., Hyllner, J., Rao, M.S., and Chesnut, J.D. 2008. Creation of engineered human embryonic stem cell lines using phiC31 integrase. Stem Cells, 26: 119–126.
Wohlgemuth, J.G., Kang, S.H., Bulboaca, G.H., Nawotka, K.A., and Calos, M.P. 1996. Long-term gene expression from autonomously replicating vectors in mammalian cells. Gene Ther., 3: 503–512.
Yahata, K., Kishine, H., Sone, T., Sasaki, Y., Hotta, J., Chesnut, J.D., Okabe, M., and Imamoto, F. 2005. Multi-gene gateway clone design for expression of multiple heterologous genes in living cells: conditional gene expression at near physiological levels. J. Biotechnol., 118: 123–134.
Yahata, K., Maeshima, K., Sone, T., Ando, T., Okabe, M., Imamoto, N., and Imamoto, F. 2007. cHS4 insulator-mediated alleviation of promoter interference during cell-based expression of tandemly associated transgenes. J. Mol. Biol., 374: 580–590.
Yamanaka, S. 2008. Induction of pluripotent stem cells from mouse fibroblasts by four transcription factors. Cell Prolif., 41 Suppl. 1: 51–56.