Individual chromosome identification, chromosomal collinearity and genetic-physical integrated map in Gossypium darwinii and four D genome cotton species revealed by BAC-FISH

Yimei Gan, Fang Liu, Renhai Peng, Chunying Wang, Shaohui Li, Xiangdi Zhang, Yuhong Wang and Kunbo Wang*
State Key Laboratory of Cotton Biology (China)/Cotton Research Institute of Chinese Academy of Agricultural Science, Anyang, Henan, 455000, P.R. China
The Institute of Tropical Bioscience and Biotechnology of Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan, 571101, P.R. China
Anyang Institute of Technology, Anyang, Henan, 455000, P.R. China


INTRODUCTION
Chromosome recognition is the foundation of research on plant genetics, evolution and genomics. Conventional karyotypic analysis once played a major role in plant chromosome identification. However, the method, which is based mainly on comparing chromosome lengths, has proved difficult and inaccurate for plants whose chromosomes are morphologically similar. Other approaches to chromosome identification have been reported, including C-banding (Borzan and Papes, 1978), G-banding (Drewry, 1982), and fluorescence banding (Hizume et al., 1983). These procedures have not been adopted for the majority of plant species because of their low reproducibility. Therefore, a simple and reproducible karyotyping method for identifying individual plant chromosomes needs to be developed. Recently, fluorescence in situ hybridization (FISH), which is used to localize DNA probes and identify chromosomes or chromosomal segments, has been adapted successfully to identify the chromosomes of many plant species, including rice, potato and cotton (Cheng et al., 2001; Dong et al., 2000; Gan et al., 2011; Kim et al., 2002; Ma et al., 2005; Peng et al., 2012; Wang et al., 2007, 2008; Zhang et al., 2004). The technique has developed from highly repeated sequence probes to single-copy probes (Desel et al., 2001; Zhu et al., 1999) and from single-color to multiple-color detection (Tang et al., 2009). Moreover, as its resolution and accuracy have improved, FISH has been used not only to identify chromosomes but also to construct physical maps and to integrate genetic and physical maps (Cheng et al., 2001; Islam-Faridi et al., 2002; Kim et al., 2005; Kulikova et al., 2001). FISH has become an irreplaceable tool for filling gaps in genome sequencing (Jiang and Gill, 2006).
Cotton (Gossypium) is an important fiber crop and a model plant for research in cytogenetics, genomics and evolutionary biology. G. hirsutum accounts for about 90 percent of fiber production, which is valued at about $20 billion annually and sustains a textile industry worth about $500 billion per year worldwide (Paterson, 2009). Because the genetic base of G. hirsutum is narrow, it is difficult to achieve a breakthrough in breeding new varieties with good agronomic and economic traits from G. hirsutum germplasm alone. Wild cotton species, however, are rich in gene resources, and cotton breeders have long focused on how to explore and identify them effectively (Wendel et al., 2010). Of the 48 wild cotton species (seven genomes in total), tetraploid G. darwinii (A d D d ) has been considered closer to G. hirsutum than the other tetraploid species. In addition, the thirteen species of the D genome, which are divided into two sections (including six subsections), have been studied and shown to be a better resource than the other genomes, both for their closeness to the tetraploid cottons (Wendel and Cronn, 2003) and for the surprising contribution of their fiber genes and traits to the tetraploid cottons (Jiang et al., 1998; Hovav et al., 2008; Rong et al., 2007). Of these 13 species, based on morphological characteristics, G. klotzschianum (D 3k ), G. davidsonii (D 3d ), G. armourianum (D 2-1 ) and G. aridum (D 4 ) derive from the same branch, which contains three subsections and differs from the other three subsections derived from other branches (Wendel and Cronn, 2003; Wendel et al., 2009, 2010). Moreover, genomic in situ hybridization (GISH) analysis suggests that D 3d is the possible donor of D d (Xiao et al., unpublished). As for valuable traits, D 3k and D 4 are thought to be resistant to Verticillium and Fusarium wilt with strong fiber, and resistant to spider mite and drought, respectively (Huang et al., 2003). D 3d , the sister species of D 3k , is considered to tolerate salt and drought and also to be highly resistant to cotton aphid, spider mite, foot spot disease, and Verticillium and Fusarium wilt (Huang et al., 2003). For D 2-1 , attention has focused on its resistance to drought, leafhopper, cotton bollworm and foot spot disease, and on its fine, strong fibers (Huang et al., 2003). Tetraploid G. darwinii has been reported to be highly resistant to drought and to have fine fiber quality (Huang et al., 2003). The genetic map and BAC library of G. darwinii are currently being constructed (Chen et al., unpublished). Some research on D 3k , D 3d , D 2-1 and D 4 has been reported, including karyotype analysis with classical methods, systematic evolution based on morphological features and molecular evidence, rDNA location, and distant hybridization breeding (Wang et al., 2001). However, because seed of only a few wild cotton species and few translocation lines are available, the use of valuable genes and evolutionary research have been limited and difficult. In addition, chromosome identification using FISH, and a standard nomenclature linked to the genetic map, remain largely unavailable for the five species, although they would provide useful information for exploring germplasm resources.
Gossypium is a large genus of more than 50 diploid and polyploid species divided into seven genomes. Great diversity among species, from morphological characteristics to geographical distribution and from chromosomes to sequence structure, gives rise to different traits within and between genomes and within and between species. Also, Gossypium species are polyploids that have retained whole-genome duplications (WGDs); genetic mapping analysis suggests that even the ancestors of the diploid species may have been ancient tetraploids (Yang, 2001). Genome duplication is often accompanied by increases in species diversity through processes such as reciprocal gene loss, subfunctionalization and speciation (Van de Peer et al., 2009). Therefore, the thirteen species of the D genome show different traits derived from different genes and sequences, which indicates that chromosome-specific BAC clones that work in some species may be unsuitable for others.
In this study, to establish a theoretical basis for cytogenetic, evolutionary and genomic research, we identified and designated the individual chromosomes of G. darwinii, D 3k , D 3d , D 2-1 and D 4 . The 45S and 5S rDNA were located to specific chromosomes in each of the five species. The signal positions of each marker on the corresponding chromosomes were also compared among the different genomes in order to identify the relationship of chromosomal collinearity. Moreover, a comparison with the genetic map was conducted. The orientation of the linkage groups in D d was established using the D genome centromere-specific clone.

Plant materials and clones
The species, their genomes and the accessions used in this study were as follows: G. klotzschianum, D 3-k (abbreviated here as D 3k ), D 3-k 57; G. davidsonii, D 3-d (abbreviated here as D 3d ), D 3d 2; G. armourianum, D 2-1 , D 2-1 -2; G. aridum, D 4 , D 4 -2; and G. darwinii, (AD) 4 (abbreviated here as AD 4 ), AD 4 -7. These accessions grow perennially in the National Wild Cotton Plantation in Sanya City, Hainan Island, sponsored by the Cotton Research Institute of the Chinese Academy of Agricultural Sciences (CRI-CAAS) in Anyang City, Henan Province, China. The five accessions are also conserved in pots in a greenhouse at CRI-CAAS.
For A subgenome of G. darwinii, three types of probes were used in this study, including 45S rDNA, 5S rDNA and a set of A h chromosome-specific BAC clones.For D subgenome of G. darwinii, D 3k , D 3d , D 2-1 and D 4 , four types of probes were used, including 45S rDNA, 5S rDNA, BAC clone 150D24 and a set of D h chromosome-specific BAC clones.The A h (D h ) chromosome-specific BAC clones used in the identification of individual chromosome were kindly provided by Professor Tianzhen Zhang of Nanjing Agricultural University, China (Wang et al., 2007).The BAC clone 150D24 which contains centromere-specific repeats in D subgenome and D genome of Gossypium was screened from Pima 90-53 BAC library (Wu et al., 2010) and was used to indicate the centromere position.The 45S and 5S rDNA derived from Arabidopsis thaliana were kindly provided by Professor Yunchun Song of Wuhan University, China.

DNA probes preparation
The 45S rDNA, 5S rDNA and BAC DNA probes were isolated using a standard alkaline extraction (Sambrook and Russell, 2002). The 45S rDNA and BAC clone 150D24 were labeled by standard Dig-nick translation reactions, whereas the 5S rDNA and a set of A h (D h ) subgenome chromosome-specific BAC clones (Wang et al., 2007) were labeled with Biotin-nick translation reactions, according to the instructions of the manufacturer (Roche Diagnostics, USA).

Chromosome preparation and FISH
Preparation of mitotic chromosomes and the FISH procedure followed modified protocols (Wang et al., 2001). Biotin-labeled and digoxigenin-labeled probes were detected by avidin-fluorescein (green) and anti-digoxigenin-rhodamine (red) (Roche Diagnostics, USA), respectively. Chromosomes were counterstained with 4',6-diamidino-2-phenylindole (DAPI) in antifade VECTASHIELD solution (Vector Laboratories, Burlingame, CA). In the probe-cocktail mixture, gDNA was used as blocking DNA instead of Cot-1 DNA. The amount of blocking DNA was 200 times that of the chromosome-specific BAC DNA. The hybridization signals were observed using a fluorescence microscope (Leica MRA2) with a charge-coupled device (CCD) camera (Zeiss) and arranged with Adobe Photoshop 7.0.
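The 200-fold blocking ratio stated above can be turned into a quick calculation when preparing a probe cocktail. The sketch below is only illustrative: it assumes the ratio is a mass ratio, and both the function name and the 50 ng probe amount are hypothetical values that do not come from the protocol.

```python
def blocking_dna_ng(probe_ng: float, ratio: float = 200.0) -> float:
    """Return the mass (ng) of unlabeled genomic blocking DNA for a given
    mass (ng) of labeled chromosome-specific BAC probe.

    Assumption: the 200x figure is interpreted as a mass ratio.
    """
    if probe_ng < 0:
        raise ValueError("probe mass must be non-negative")
    return probe_ng * ratio


# Hypothetical example: 50 ng of labeled BAC probe per slide.
needed = blocking_dna_ng(50)
print(f"{needed:.0f} ng of gDNA block ({needed / 1000:.1f} ug)")
```

With 50 ng of probe, the sketch gives 10,000 ng (10 ug) of blocking gDNA; the actual probe amount per slide is not stated in the text.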

Primary location of rDNA for five species
To distinguish the signals of rDNA from those of the BACs, the location and number of 45S rDNA and 5S rDNA loci were determined first (Fig. 1). In G. darwinii, three 45S rDNA loci were revealed at the end of the short arm of chromosomes, whereas two 5S rDNA loci were intercalary on the short arm of chromosomes (Fig. 1a). In G. klotzschianum and G. davidsonii, four 45S rDNA loci were revealed at the end of the short arm of chromosomes, whereas one 5S rDNA locus was intercalary on the short arm of a chromosome (Fig. 1, b and c). In G. armourianum and G. aridum, three 45S rDNA loci were revealed at the end of the short arm of chromosomes, whereas one 5S rDNA locus was intercalary on the short arm of a chromosome (Fig. 1, d and e).

Individual chromosome identification
The chromosome-specific BAC clones had been screened using chromosome-specific SSR markers derived from the A and D subgenomes, respectively. All 13 SSR markers from the A subgenome were confirmed by PCR to be present in the A subgenome of G. darwinii. Likewise, all 13 pairs of SSR markers from the D subgenome were confirmed by PCR to be present in the D subgenome of G. darwinii and in D 3k , D 3d , D 2-1 and D 4 . Most of the SSR markers amplified products of the expected size (data not shown). Therefore, the two sets of BAC clones (A h 01-A h 13 and D h 01-D h 13) from the A and D subgenomes of G. hirsutum can be used as FISH markers to identify the individual chromosomes of G. darwinii, D 3k , D 3d , D 2-1 and D 4 .
All the individual mitotic metaphase chromosomes of the A subgenome of G. darwinii (A d ) were identified by two-color BAC-FISH (FISH probed with BAC clones). The results showed that all 13 A h chromosome-specific BAC clones were located to the corresponding chromosomes and were confirmed to be a set of chromosome-specific molecular cytological markers for G. darwinii (the FISH images of whole chromosome spreads are not shown). According to the homology between A d and A h , a systematic nomenclature was established for the chromosomes of A d , which were named A d 01-A d 13. The relative positions of the 13 chromosome-specific BACs were revealed by measuring the red signals at the centromere region with isis software. A d 01,

rDNA location
Based on the successful identification of individual chromosomes, the chromosomes and chromosome arms carrying 45S and 5S rDNA were determined for A d , D d , D 3k , D 3d , D 2-1 and D 4 . In G. darwinii, three 45S rDNA loci were revealed at the end of the short arm of chromosomes A d 09, D d 07 and D d 09, whereas two 5S rDNA loci were intercalary on the short arm of chromosomes A d 09 and D d 09 (Fig. 2, a and b). D 3d and D 3k have the same number of rDNA loci at the same chromosomal locations. Four 45S rDNA loci were located at the end of the short arm of chromosomes D 3k 05 (D 3d 05), D 3k 07 (D 3d 07), D 3k 09 (D 3d 09) and D 3k 12 (D 3d 12), whereas one 5S rDNA locus was intercalary on the short arm of chromosome D 3k 09 (D 3d 09) (Fig. 2, c and d). Three 45S rDNA loci and one 5S rDNA locus were revealed in both D 2-1 and D 4 . The three 45S rDNA loci were all located at the end of the short arm of chromosomes D 2-1 05 (D 4 05), D 2-1 07 (D 4 07) and D 2-1 09 (D 4 09), and the 5S rDNA locus was intercalary on the short arm of chromosome D 2-1 09 (D 4 09) (Fig. 2, e and f). Taken together, the 45S and 5S rDNA were syntenic on A d 09, D d 09, D 3k 09, D 3d 09, D 2-1 09 and D 4 09.
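The rDNA assignments reported in this section can be collected into a small lookup table, which makes the synteny statement easy to check. The sketch below only restates the chromosome numbers given in the text. The dictionary layout, the genome keys written without spaces (for example "D3k"), and the function name are illustrative choices, not part of the original analysis.

```python
# Chromosomes carrying 45S and 5S rDNA loci, as reported in the text.
RDNA_LOCI = {
    "Ad":   {"45S": ["09"],                   "5S": ["09"]},
    "Dd":   {"45S": ["07", "09"],             "5S": ["09"]},
    "D3k":  {"45S": ["05", "07", "09", "12"], "5S": ["09"]},
    "D3d":  {"45S": ["05", "07", "09", "12"], "5S": ["09"]},
    "D2-1": {"45S": ["05", "07", "09"],       "5S": ["09"]},
    "D4":   {"45S": ["05", "07", "09"],       "5S": ["09"]},
}


def chromosomes_with_both(loci: dict) -> set:
    """Chromosome numbers that carry both a 45S and a 5S rDNA locus."""
    return set(loci["45S"]) & set(loci["5S"])


# Chromosome numbers on which 45S and 5S rDNA co-occur in every genome.
shared = set.intersection(
    *(chromosomes_with_both(loci) for loci in RDNA_LOCI.values())
)
print(sorted(shared))  # ['09']

# Locus counts for tetraploid G. darwinii (Ad + Dd subgenomes).
n_45s = len(RDNA_LOCI["Ad"]["45S"]) + len(RDNA_LOCI["Dd"]["45S"])
n_5s = len(RDNA_LOCI["Ad"]["5S"]) + len(RDNA_LOCI["Dd"]["5S"])
print(n_45s, n_5s)  # 3 2
```

The only chromosome number shared by all six genomes is 09, in line with the synteny reported above, and the G. darwinii totals match the three 45S and two 5S loci seen in the primary rDNA survey.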

Comparative analysis within BAC clones and the related SSR markers to genetic map
The orientations of chromosome arms were determined according to the location of the centromere-specific clone 150D24. The relative positions of the BAC clones across species, and their positions on the genetic and physical maps, were compared using BAC-FISH. To assess chromosomal collinearity among the five species, the relative positions of the 13 D h BAC clones and of the corresponding 13 SSR markers were compared and integrated (Fig. 3). The genetic positions of the 13 SSR markers were taken from the high-density genetic map of cotton (Guo et al., 2007). The arm positions on the linkage map were estimated from the relative positions of the BACs on the chromosomes, so the orientations of the genetic linkage groups were determined according to the positions of the chromosome-specific SSR markers. By convention, the short arm is shown at the top. The comparison showed that seven SSR markers had positions consistent with the original linkage orientations, whereas the orientations of six linkage groups, 03, 04, 06, 09, 10 and 12, were found to be reversed. NAU657 (Chr. 03) and BNL358 (Chr. 04), both from the middle part of the short arm on the genetic linkage map, were FISH mapped close to the centromere region of the short arm and to the end of the long arm, respectively. BNL169 (Chr. 10), from the middle part of the long arm on the genetic linkage map, was located at the end of the long arm. BNL358 (Chr. 06), BNL3511 (Chr. 09) and BNL1669 (Chr. 12), from the middle part of the short arm on the genetic linkage map, were all FISH mapped near the centromere region of the long arm. With the orientations established, BAC positions were compared with SSR marker positions on the genetic linkage groups. Most of the positions corresponded, with the exception of BNL358: this marker lies in the middle of its linkage group but was FISH mapped near the end of chromosome 04.
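Reversing a linkage group, as reported for linkage groups 03, 04, 06, 09, 10 and 12, amounts to replacing each marker position p by (L - p), where L is the length of the group in centiMorgans. The following sketch illustrates that transformation together with one plausible decision rule. It rests on several assumptions: the map is taken to start at 0 cM, the marker names and positions are hypothetical, and the flip criterion (choose the orientation in which the relative genetic position is closer to the relative FISH position) is an illustration rather than the procedure used in the study.

```python
def flip_linkage_group(positions_cm: dict) -> dict:
    """Reverse a linkage group: each position p becomes (length - p).

    Assumes positions start at 0 cM, so the largest value is the length.
    """
    length = max(positions_cm.values())
    return {marker: length - pos for marker, pos in positions_cm.items()}


def should_flip(genetic_fraction: float, fish_fraction: float) -> bool:
    """Illustrative rule: flip when the reversed genetic position lies
    closer to the FISH position than the original one does.

    Both arguments are relative positions in [0, 1], measured from the
    top of the linkage group and from the end of the short arm.
    """
    original_gap = abs(genetic_fraction - fish_fraction)
    reversed_gap = abs((1 - genetic_fraction) - fish_fraction)
    return reversed_gap < original_gap


# Hypothetical linkage group: two end markers and one BAC-anchored marker.
group = {"top_end": 0.0, "bac_marker": 30.0, "bottom_end": 120.0}
relative = group["bac_marker"] / max(group.values())  # 0.25 from the top

if should_flip(relative, fish_fraction=0.8):  # FISH signal on the long arm
    group = flip_linkage_group(group)

print(group)  # {'top_end': 120.0, 'bac_marker': 90.0, 'bottom_end': 0.0}
```

A single anchored marker per linkage group gives only a coarse orientation test, which is why a marker near the middle of a group (relative position close to 0.5) would be uninformative under this rule.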

DISCUSSION
Chromosome identification
Chromosome identification is the foundation of cotton genomic research. Because cotton species have many small chromosomes of similar morphology, cytogenetic research such as chromosome recognition has lagged far behind that in other plants. BAC-FISH, whose probes are BAC clones screened from a BAC library with genetic molecular markers, has proved to be a highly efficient method for identifying plant chromosomes (Cheng et al., 2001; Dong et al., 2000; Gan et al., 2011; Hasterok et al., 2006; Kim et al., 2002; Kim et al., 2009; Kulikova et al., 2001; Wang et al., 2007, 2008). With advances in cotton genomics, cotton cytogenetics has made considerable progress, and chromosome recognition has been completed for several species, including G. hirsutum (Wang et al., 2007), G. arboreum (Wang et al., 2008), G. thurberi, G. trilobum and the D subgenome of G. barbadense (Gan et al., 2011). In the present study, the centromeres and the individual chromosome-specific BAC clones were successfully located to the corresponding chromosomes in D d , D 3k , D 3d , D 2-1 and D 4 using D genome centromere-specific and chromosome-specific BAC clones. The chromosomes of these species, which have played a surprising role in the improvement of cotton fiber traits (Paterson et al., 2010), were thereby assigned. Because this identification is based on physical location, the results are unambiguous and reliable. Accurate identification provides valuable cytogenetic information for cotton genomics research. First, the uncertainty about rDNA locations in these species can be resolved. Moreover, chromosome identification will allow researchers to locate target sequences physically and correctly on specific chromosomes, as in other plants (Jiang et al., 1995; Kim et al., 2002; Yan et al., 1998). An EST sequence related to cotton heterosis has been successfully located on D h 05 (Wang et al., 2006). Last but not least, because the same chromosome-specific markers from G. hirsutum were used throughout, the relationships among the different species were integrated, which will facilitate the use of cotton germplasm and evolutionary research.

Systematic chromosome nomenclature based on the homology relationship
In classical karyotype analysis, cotton chromosomes were classified and named according to relative chromosome length. This approach to nomenclature is ambiguous and inefficient, because for small and similar chromosomes it is inaccurate and not linked to the genetic linkage map. Recent research has shown that these problems can be resolved by using BAC-FISH. The chromosomes of G. hirsutum were identified by BAC-FISH and named A01-13 and D01-13 for the A and D subgenomes, respectively (Wang et al., 2006). By the same process, the chromosomes of G. arboreum were assigned and named A01-13 (Wang et al., 2008). The chromosome designations of both species were linked to the genetic map, but the shared names were confusing across species. To distinguish the chromosome nomenclature among Gossypium species, the chromosomes of G. hirsutum A h (D h ), G. thurberi (D 1 ), G. trilobum (D 8 ) and the D subgenome of G. barbadense (D b ) were named A h 01-13 (D h 01-13), D 1 01-13, D 8 01-13 and D b 01-13, respectively, based on the original abbreviations of the species names and BAC-FISH (Gan et al., 2011). In the present research, we likewise systematically named the chromosomes of G. darwinii A d (D d ), G. klotzschianum (D 3k ), G. davidsonii (D 3d ), G. armourianum (D 2-1 ) and G. aridum (D 4 ) as A d 01-13 (D d 01-13), D 3k 01-13, D 3d 01-13, D 2-1 01-13 and D 4 01-13, respectively. This nomenclature is consistent with that of the genetic linkage map and of classical genetics based on genetic stocks (Wang et al., 2006), and it is unified across cotton species, because the chromosomes of all these species were identified with the same set of chromosome markers. This will facilitate cross-disciplinary research and the study of interspecies evolution.

Comparative analysis of BACs in different cotton species
Interspecies collinearity provides invaluable information for inferring the common ancestry of genes and for studying species with limited genomic information by comparison with model plants that have rich genomic information (Tang et al., 2009). The relationships among species can be revealed by comparing genetic markers (Ma et al., 2006; Wang et al., 2011), but this still requires mapping populations and polymorphism. In contrast, comparative mapping among closely related species by BAC-FISH does not rely on mapping populations or polymorphism, and because the fragments compared are larger, the homologous regions or rearrangements detected are larger as well. Physical comparative mapping by FISH has thus revealed chromosome rearrangement and collinearity between Arabidopsis and Brassica (Howell et al., 2005; Ziolkowski and Sadowski, 2002), tomato and potato (Iovene et al., 2008; Szinay et al., 2008; Zhu et al., 2008), and the genera Oryza and Sorghum (Hass-Jacobus et al., 2006), providing information for phylogenetic and evolutionary research. In the present study, the corresponding locations between the D genomes and the D subgenome, as well as among the D genomes, indicated a conserved collinear relationship. Because a BAC-FISH probe usually covers a fragment of about 100 kb, our results suggest that large chromosomal segments are conserved between the D genomes and the D subgenome as well as among the D genomes. As D 3k , D 3d , D 2-1 and D 4 were placed in the same branch on the basis of morphological characteristics, our results offer new cytogenetic evidence for the interspecific evolution and classification of the D genome. In general, the comparison of BACs will help sequence assembly during whole-genome sequencing, as well as genomic applications in interspecies evolution and breeding.

Integration of genetic and physical map
The genetic map and the physical map were integrated to analyze their relationship. The orientation of the genetic map was established, which improves the correspondence between the genetic and physical maps. The comparison showed that most SSR markers and BAC clones had corresponding locations, with one exception. Such high similarity between the genetic map and the physical map may indicate good coverage by the genetic linkage map, which differs from the situation in Sorghum, Medicago truncatula and wheat (Kim et al., 2002; Kulikova et al., 2001; Zhang et al., 2004). Moreover, because the genus Gossypium shares whole-genome duplication events, repetitive sequence blocks and molecular fingerprinting errors have made it difficult to assemble a physical map in Gossypium. With the BAC-FISH approach presented here, individual BAC clones can be accurately localized to chromosomes and linkage groups, so the integrated map will assist sequence assembly in the ongoing whole-genome sequencing project. As the greatest challenge facing cotton research is the conversion of sequence to knowledge, not whole-genome sequencing per se (Paterson et al., 2010), the integration of genetic and physical maps will provide important information for applying sequence data to cotton breeding and evolution. In addition, the use of the valuable traits of the five wild species may be especially important for sustainable and improved cotton production. An estimate of the physical distance between probes is valuable when positional cloning is considered for isolating useful genes. Our integrated results are expected to play an important role in expediting the use of wild resources. Therefore, the integration of the genetic and physical maps, together with the establishment of genetic map orientation in the five wild cotton species, will provide useful and fundamental information for genetic map construction, gene cloning, genome sequencing and the conversion of sequence into knowledge.

Fig. 2. Identification of individual chromosomes and rDNA in A d and D d of G. darwinii, G. klotzschianum, G. davidsonii, G. armourianum and G. aridum. Dual-FISH images with BAC signals on the corresponding chromosomes for the five species. The chromosomes were selected with BAC clones. BAC DNA: green and weak fluorescence signals; 5S rDNA: green and strong fluorescence signals near the centromere on the short arm of chromosomes; 45S rDNA: red and strong fluorescence signals at the terminal of the short arm of chromosomes; centromere clone 150D24: red fluorescence signals at intercalary positions on the chromosomes. The short arm and the long arm were distinguished by the location of 150D24. A d : A subgenome of G. darwinii, D d : D subgenome of G. darwinii, D 3k : G. klotzschianum, D 3d : G. davidsonii, D 2-1 : G. armourianum, D 4 : G. aridum. Bar = 5 μm.

Fig. 3. Individual chromosomes of D genomes and D t with FISH signals derived from the chromosome-specific BACs, and comparisons of linkage and FISH maps. A-E indicate chromosomes (Chr. 01-13) from D 3k , D 3d , D 2-1 , D 4 and D d , respectively. F: Linkage maps showing three loci, including two end markers and one marker used to select BAC clones for FISH (according to Guo et al. (2007), but with the orientations of some linkages reversed to match the positions of chromosome-specific BACs: linkages 01, 02, 04 and 10); the numbers indicate the positions of markers on the linkage groups. Positions of loci are given in centiMorgans.