Edited by Koji Murai. Saneyoshi Ueno: Corresponding author. E-mail: saueno@ffpri.affrc.go.jp

Index
INTRODUCTION
MATERIALS AND METHODS
Plant material and mRNA extraction
cDNA library construction and sequencing
Processing of EST sequences
Analysis of EST-SSR markers
RESULTS
Unigene construction and similarity searches
Microsatellite mining
Analysis of EST-SSR markers
DISCUSSION
Characteristics of the EST-SSRs
Analysis of EST-SSR markers
References

INTRODUCTION

The areas that have been reforested in Japan have increased three-fold in recent years – from ca. 34,000 ha in 1960 to ca. 221,000 ha in 2000 – and the areas reforested by broad-leaved species have also rapidly increased (Ministry of Agriculture, Forestry and Fisheries of Japan, 2002). Among broad-leaved species, members of the Fagaceae have been widely used, and the area reforested by Quercus species has reportedly doubled in the past 20 years. Ideally, the genetic composition and diversity of the populations used as sources of planting material should be carefully analyzed in such reforestation programs, for two reasons: firstly, to maintain diversity in the planted material and, secondly, to maximize the chances of establishment, since provenance tests have shown that large proportions of genetic differentiation can be attributed to climatic adaptation (Matyas, 1996; Saenz-Romero et al., 2006). If adaptive genetic variation is ignored, there may be severe detrimental effects, potentially including complete silvicultural failure. Generally, seedlings of local provenance are the ideal materials for reforestation. However, there is no clear agreement regarding precisely how “local” such materials need to be (McKay et al., 2005). Analysis of genetic diversity using suitable markers should help to resolve this issue.

In recent years, expressed sequence tag (EST)-based markers have been increasingly developed and used for both linkage map construction (Tani et al., 2003) and analysis of genetic variation (Rowland et al., 2003; Schubert et al., 2001). ESTs are single-pass reads from expressed regions of a genome that can be easily acquired with high-throughput capillary sequencers. ESTs can be generated relatively rapidly, and partial sequencing of cDNAs is a cost-effective way to obtain large amounts of information on gene expression and the coding sequences of genomes. Furthermore, since ESTs frequently include coding regions of their respective genomes, their analysis provides information on functional sequences, which is especially valuable for species whose genomes have not yet been fully sequenced. In addition, some variations in DNA sequences derived from ESTs may be related to natural selection (Kado et al., 2003); thus, ESTs are also useful for identifying adaptive genetic variation.

Quercus mongolica (Fagaceae) is a deciduous tree that occurs on all of the main Japanese islands (Hokkaido, Honshu, Shikoku and Kyushu). Genomic microsatellite markers for Q. mongolica have been developed (Mishima et al., 2006) and registered in publicly available databases. However, no ESTs or EST-based markers have been previously developed for this species. Therefore, in the study presented here, we generated and characterized EST sequences in order to develop EST-based genetic markers of Q. mongolica var. crispula to facilitate genetic diversity analyses. We collected 3385 EST sequences, analyzed the relative abundance of each EST, and performed similarity searches to infer their putative functions. Furthermore, simple sequence repeats (SSRs) were mined within our library. PCR primers for these sequences were developed and their polymorphisms were surveyed. The sequences characterized and EST-SSR markers developed in this study should be valuable resources for the analysis of genetic diversity of Q. mongolica and related species.


MATERIALS AND METHODS

Plant material and mRNA extraction

An adult Q. mongolica tree (ca. 25 cm in diameter at breast height) growing in the arboretum of the Forestry and Forest Products Research Institute (36°00.8’N, 140°13.0’E) was selected as a source of RNA, and in May 2005 several twigs at about 5 m above ground were cut from it. The twigs were chopped into sticks of about 1.5 cm in diameter and 25 cm in length. The outer bark was peeled from the sticks using a cutter. The inner bark was then stripped, immediately frozen in liquid nitrogen, sliced and ground in liquid nitrogen using a mortar and pestle. About 50 g of sliced inner bark was used for RNA extraction. Total RNA was extracted by the CTAB method (Chang et al., 1993) and further purified using an SV total RNA isolation system (Promega, Madison, USA).

cDNA library construction and sequencing

The mRNA was purified using an Oligotex-dT30 Super mRNA purification Kit (Takara, Japan), then a cDNA library was constructed using a cDNA Library construction Kit (Stratagene, La Jolla, USA). First-strand cDNA synthesis was carried out with oligo d(T)18 primers. Synthesized cDNAs were size-selected and ligated into pBluescript II SK(+) vectors (Stratagene, La Jolla, USA). The ligated vectors were then transformed into competent DH10B Escherichia coli cells by electroporation. The transformed cells were plated onto LB media and incubated overnight. White colonies were randomly selected and sequenced from the 5’ end using a MegaBACE4000 sequencer (Amersham Biosciences) and T3 primers. The cDNA library construction and sequencing were carried out by the Dragon Genomics center in Mie Prefecture, Japan.

Processing of EST sequences

The sequence data were analyzed with PartiGene software (Parkinson et al., 2004) as follows. We used trace2dbest, part of the PartiGene package, to process raw trace files, perform base calling and remove vector and low-quality sequences with Phred (Ewing and Green, 1998; Ewing et al., 1998). The error probability cutoff value of Phred was set at the default value of 0.05. Sequences less than 150 bp long after Phred processing were removed from the following analysis. The remaining sequences were used for clustering with CLOBB (Parkinson et al., 2002) under its default settings. Clusters containing more than one sequence were assembled into consensus sequences with Phrap (Green, 1999) under its default settings, except that the forcelevel was set at 10 according to the suggestion in the “UserGuide & Tutorial” of PartiGene. All contigs (including singletons, for which only one sequence was included in a contig) were assumed to be unigenes and used for primary annotation through similarity searches against the NCBI nr database using the Blastx (Altschul et al., 1990) algorithm with an e-value cutoff of 1e-5. The unigenes were functionally classified through similarity searches against uniprot (uniprot_trembl and uniprot_sprot) databases (Apweiler et al., 2004) with an e-value cutoff of 1e-25 according to annotation based on GO and GO slim terms (Harris et al., 2004), using annot8r_blast2GO, a sequence annotation script published by Schmid & Blaxter (http://www.nematodes.org/PartiGene/index.html). Uniqueness of the unigenes was assessed by similarity searches against NCBI UniGenes of Arabidopsis thaliana (UniGene At build 52), Oryza sativa (UniGene Os build 61) and Populus balsamifera (UniGene Pba build 2), using the tBlastx algorithm with an e-value cutoff of 1e-5. Next, microsatellites were surveyed using SSRIT (Temnykh et al., 2001), a microsatellite search tool available from the USDA-ARS Center for Bioinformatics. Sequences with at least nine, six and five repeats for di-, tri- and tetra-SSRs, respectively, were surveyed. Whether the microsatellites within each unigene were located in coding or non-coding (5’ UTR and 3’ UTR) regions was assessed using ESTScan (Iseli et al., 1999). The functions of unigenes harboring microsatellite sequences were analyzed according to the GO annotations described above. To assess whether the number of microsatellite-containing unigenes in specific categories significantly deviated from the numbers expected from random sampling, samples of 118 were randomly selected from the 1109 GO-annotated unigenes 1000 times, and 95% confidence limits for the frequency of unigenes in each GO category were determined using Perl scripts written in-house.
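The repeat thresholds above can be illustrated with a minimal Python sketch. This is not SSRIT itself: the function name `find_ssrs`, the restriction to perfect (uninterrupted) repeats and the output format are assumptions made only for illustration, while the minimum repeat numbers (nine, six and five) are those stated in the text.

```python
import re

# Minimum repeat counts from the text: di >= 9, tri >= 6, tetra >= 5.
MIN_REPEATS = {2: 9, 3: 6, 4: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Hypothetical helper for illustration; `start` is a 0-based offset.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # The group captures one repeat unit; the backreference demands
        # at least (min_rep - 1) further identical copies right after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip units that are repeats of a shorter unit (AA, AGAG, ...),
            # so that an AG run is not also reported as an AGAG run.
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "TTT" + "AG" * 10 + "CCC" + "AAG" * 6 + "T" + "AG" * 8
    for start, motif, n in find_ssrs(example):
        print(start, motif, n)
```

In the example sequence, the final run of eight AG units is not reported because it falls below the nine-repeat threshold for di-SSRs.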

Analysis of EST-SSR markers

PCR primers for di- and tri-SSRs were designed with Primer3 (Rozen and Skaletsky, 2000), called from the read2Marker script (Fukuoka et al., 2005) under its default settings. The utility of the EST-SSR primers designed in the present study was demonstrated by analyzing polymorphisms among individuals of Q. mongolica, Q. serrata and Q. dentata (Table 1). These three species are sufficiently closely related to hybridize with each other. Eight individuals of each of the three species were genotyped by PCR carried out in 6 μL reaction mixtures containing ca. 10 ng genomic DNA, 1 × PCR buffer, 200 μM of each dNTP, 1.5 mM MgCl2, 0.2 μM of each primer designed in the present study and 0.15 U of Taq polymerase (Promega), using the following program: 94°C for 3 min, then 40 cycles of 94°C for 45 sec, 55°C for 45 sec and 72°C for 45 sec, followed by a final extension at 72°C for 7 min. PCR products were labeled with ChromaTide Rhodamine Green-5-dUTP (Molecular Probes, Eugene, USA) according to the method of Kondo et al. (2000), and analyzed using a 3100 Genetic Analyzer with GeneScan software (Applied Biosystems, Foster City, USA). For each locus, the number of alleles (Na) was counted and observed heterozygosity (Ho) was calculated. The proportion of shared alleles between individuals was computed with the MSA software (Dieringer and Schlötterer, 2003) and an NJ dendrogram was constructed with the PHYLIP package (Felsenstein, 1989). Dendrograms were graphically displayed with MEGA version 3.1 (Kumar et al., 2004).
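The per-locus statistics and the distance between individuals described above can be sketched as follows. This is an illustrative sketch, not the MSA implementation: the function names, the representation of a genotype as a pair of fragment sizes (with `None` for a missing genotype), and the definition of the distance as one minus the mean per-locus proportion of shared alleles are assumptions, and MSA may treat details such as missing data differently.

```python
def allele_count(genotypes):
    """Number of distinct alleles (Na) at one locus.

    `genotypes` is a list of (allele1, allele2) tuples; None marks
    an individual that could not be scored.
    """
    return len({a for g in genotypes if g is not None for a in g})


def observed_heterozygosity(genotypes):
    """Proportion of scored individuals that are heterozygous (Ho)."""
    scored = [g for g in genotypes if g is not None]
    if not scored:
        raise ValueError("no scored genotypes")
    return sum(1 for a, b in scored if a != b) / len(scored)


def shared_allele_distance(ind1, ind2):
    """One minus the mean proportion of shared alleles across loci.

    `ind1` and `ind2` are lists of genotypes in the same locus order;
    loci missing in either individual are skipped (an assumption).
    """
    proportions = []
    for g1, g2 in zip(ind1, ind2):
        if g1 is None or g2 is None:
            continue
        pool = list(g2)
        shared = 0
        for allele in g1:
            if allele in pool:
                pool.remove(allele)  # each allele copy is matched only once
                shared += 1
        proportions.append(shared / 2)
    if not proportions:
        raise ValueError("no loci scored in both individuals")
    return 1 - sum(proportions) / len(proportions)


if __name__ == "__main__":
    locus = [(100, 102), (100, 100), (102, 104), None]
    print(allele_count(locus), observed_heterozygosity(locus))
    print(shared_allele_distance([(100, 102), (200, 200)],
                                 [(100, 100), (200, 204)]))
```

A matrix of such pairwise distances is the kind of input that a neighbor-joining program takes to build a dendrogram.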


Table 1
Locations of samples used for polymorphism screening



RESULTS

Unigene construction and similarity searches

A total of 3642 reads were obtained. After Phred processing, 3385 reads proved to have high-quality sequences more than 150 bp long. The average length of the high-quality sequences was 558 bp. They were then grouped into 2119 sequence clusters, 1672 of which comprised single EST sequences. The remaining 447 clusters included 1713 EST sequences and were assembled into 468 contigs. In total, 2140 unigenes (1672 singletons plus 468 contigs) were identified. All but one cluster comprised fewer than 30 ESTs; the exception included 102 sequences.

The Blastx search against the NCBI nr database showed that 1702 unigenes had similarity with proteins in the database, while no similar proteins were detected for the remaining 438 at an e-value cutoff of 1e-5. Functional classification based on GO terms assigned 538, 976 and 825 of the unigenes to the cellular component, molecular function and biological process categories, respectively. The largest proportions of terms within these categories were assigned to the intracellular (GO:0005622; 75%), catalytic activity (GO:0003824; 35%) and physiological process (GO:0007582; 56%) subcategories, respectively (Fig. 1). In total, 1109 unigenes were assigned at least one GO term. The tBlastx search against the NCBI UniGenes of Arabidopsis thaliana, Oryza sativa and Populus balsamifera showed that 1759 unigenes had similarity with at least one UniGene of the three species. The remaining 381 had no similarity with those UniGenes at an e-value cutoff of 1e-5 and were regarded as new transcripts in Q. mongolica inner bark.


Fig. 1
Functional profile of the 1109 Q. mongolica unigenes annotated according to GO slim terms (open squares). The error bars indicate 95% confidence limits for the frequencies of unigenes annotated with a specific GO slim term when 118 unigenes were randomly sampled 1000 times. The circles show functional profiles for 118 microsatellite-containing unigenes with GO annotation. Closed circles indicate significantly highly represented GO slim terms amongst unigenes with microsatellites. The GO IDs and corresponding terms are as follows: GO:0005622, intracellular; GO:0005623, cell; GO:0005576, extracellular region; GO:0005941, unlocalized protein complex; GO:0003824, catalytic activity; GO:0005488, binding; GO:0003676, nucleic acid binding; GO:0005198, structural molecule activity; GO:0005215, transporter activity; GO:0004871, signal transducer activity; GO:0005554, molecular function unknown; GO:0030528, transcription regulator activity; GO:0030234, enzyme regulator activity; GO:0003774, motor activity; GO:0007582, physiological process; GO:0006139, nucleobase, nucleoside, nucleotide and nucleic acid metabolism; GO:0006810, transport; GO:0006118, electron transport; GO:0050896, response to stimulus; GO:0006519, amino acid and derivative metabolism; GO:0007154, cell communication; GO:0007275, development; GO:0050789, regulation of biological process; GO:0006928, cell motility; GO:0009987, cellular process.


Microsatellite mining

The SSRIT script identified 274 microsatellites in the 2140 unigenes (Table 2), but microsatellite motifs were present in only 248 of the unigenes because some unigenes contained multiple microsatellites. The most frequent motif was AG, and we found no CG repeats in the unigenes. The frequency of di-SSRs decreased as the number of repeats increased; about 90% of the di-SSRs had fewer than 17 repeats. Among tri-SSRs (90% of which had fewer than 10 repeats), AAG and AAC repeats were relatively frequent (Table 2). However, there were far fewer tri-SSRs than di-SSRs. We found 10 tetra-SSRs in the unigenes. The maximum number of repeats was 49 (of the AG motif). Analysis of the location of the SSRs within genes indicated that 172 (63%) of them reside in non-coding regions, and 83 (30%) of them are likely to be located in coding regions. The locations of the remaining 19 (7%) of the SSRs were not determined, but they are likely to be non-coding. Most of the di-SSRs were located in non-coding regions, 69.5% of the tri-SSRs appeared to be in coding regions, and 80% of the tetra-SSRs were in non-coding regions (Fig. 2). Most of the non-coding SSRs (90.7%) were inferred to be in the 5’ UTR, probably because the EST sequencing was performed from the 5’ end.


Table 2
Abundance of microsatellites in the Q. mongolica unigenes





Fig. 2
Frequency distribution of di-, tri- and tetra-SSRs in coding and non-coding locations.


GO annotations, which were assigned to 118 microsatellite-containing unigenes in total, indicated that the following terms were strongly represented in them: nucleic acid binding (GO:0003676; 24.6%), nucleobase, nucleoside, nucleotide and nucleic acid metabolism (GO:0006139; 24.5%), and cell communication (GO:0007154; 9.4%) (Fig. 1). According to the random sampling tests (see above), these GO terms (GO:0003676, GO:0006139 and GO:0007154) proved to be significantly more strongly represented than expected (P < 0.05) among the unigenes harboring microsatellites (Fig. 1). After Bonferroni correction for multiple testing, only GO:0003676 (nucleic acid binding) was significantly overrepresented (P < 0.001).
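The random sampling test can be sketched in Python as follows. The original analysis used in-house Perl scripts that are not reproduced here, so the function name, the data layout (a mapping from unigene ID to a set of GO slim terms), the fixed random seed and the percentile indexing are all assumptions made for illustration; only the sample size (118), the pool size (1109) and the number of iterations (1000) come from the text.

```python
import random


def resampling_limits(annotations, term, sample_size, n_iter=1000, seed=0):
    """Empirical 95% limits for the frequency of `term` in random samples.

    `annotations` maps each unigene ID to the set of GO slim terms
    assigned to it (hypothetical layout). Sampling is without replacement.
    """
    rng = random.Random(seed)
    ids = list(annotations)
    freqs = []
    for _ in range(n_iter):
        sample = rng.sample(ids, sample_size)
        hits = sum(1 for unigene in sample if term in annotations[unigene])
        freqs.append(hits / sample_size)
    freqs.sort()
    # Take the 2.5th and 97.5th percentiles of the resampled frequencies.
    lower = freqs[int(0.025 * n_iter)]
    upper = freqs[int(0.975 * n_iter) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data only: 1109 unigenes, 100 of which carry a made-up term.
    toy = {i: ({"GO:TOY"} if i < 100 else set()) for i in range(1109)}
    print(resampling_limits(toy, "GO:TOY", sample_size=118))
```

A term would then be called overrepresented among the microsatellite-containing unigenes when its observed frequency lies above the upper limit. For a Bonferroni correction, each per-term significance level is divided by the number of terms tested.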

Analysis of EST-SSR markers

In total, 32 and 13 primer pairs were designed for di- and tri-SSRs, respectively, and 31 of them successfully amplified Q. mongolica genomic DNA sequences. The remaining 14 primer pairs showed multiple banding patterns or produced larger products than expected (more than 500 bp) and were excluded from further analysis. When the successful primer pairs were applied to related species, 28 (90%) and 31 (100%) of them amplified Q. serrata and Q. dentata DNA sequences, respectively; both species belong to the same section as Q. mongolica (Kanno, 2005; Manos and Stanford, 2001). After fluorescent labeling and electrophoresis with the capillary sequencer, 20 primer pairs (Table 3) showed a clear single-locus amplification pattern for eight Q. mongolica genomic DNA samples. Some primer pairs produced three peaks and/or peaks that were difficult to genotype and are not listed in Table 3. The number of alleles per locus (Na) and observed heterozygosity (Ho) ranged from 2 to 12, and from 0.25 to 1.00, respectively (Table 4). In addition, genotypes of the eight individuals representing Q. serrata and Q. dentata were determined at 19 and 20 loci, respectively, and their levels of polymorphism were found to be nearly as high as those in Q. mongolica. The genotypic data for all of the loci except QmC01758 (which was not examined in Q. serrata) were used to calculate shared allele distances and construct an NJ dendrogram for the 24 individuals representing the three species (Fig. 3). The 24 individuals were roughly clustered by species. There were no significant relationships between geographic distance and shared allele distance among individuals.


Table 3
Characteristics of the Q. mongolica EST-SSR markers





Table 4
Polymorphisms for EST-SSR markers for Quercus mongolica (Qm), Q. serrata (Qs) and Q. dentata (Qd)





Fig. 3
NJ dendrogram for individuals of Q. mongolica (Qm), Q. serrata (Qs) and Q. dentata (Qd) based on the proportion of shared alleles for 19 EST-SSR markers. Individual tree IDs (Table 1) are shown following species abbreviations.



DISCUSSION

Characteristics of the EST-SSRs

The frequency analysis of the di-SSRs found in this study (Table 2) showed that the AG motif was the most abundant, in accordance with the general tendency for this motif to be the most common in plant genomes (Lagercrantz et al., 1993), especially in ESTs (Morgante et al., 2002). A survey of tri-SSRs in Arabidopsis thaliana found that the AAT repeat was the most common and AAG the next most frequent (Katti et al., 2001), while CCG has been found to be the most common in Oryza sativa (La Rota et al., 2005). The tri-SSR frequency pattern we found in Q. mongolica was more similar to the A. thaliana pattern than to the O. sativa pattern reported by the cited authors. Furthermore, tri-SSRs were more concentrated in coding regions of the unigenes, while di-SSRs were located mostly in non-coding regions (Fig. 2), probably due to selective pressure against frameshift mutations in coding regions (Metzgar et al., 2000).

Functional analysis of the unigenes and EST-SSRs based on GO slim terms (Fig. 1) revealed that the EST-SSRs were significantly more frequent than expected by chance in nucleic acid-binding proteins (GO:0003676). More specifically, many of the Q. mongolica unigenes with GO:0003676 had similarity to transcription- and translation-related proteins (data not shown). This may reflect the putative importance of the flexibility conferred by mono-amino acid repeats in the proteins (transcription factors) found with nucleic acids in transcription complexes (Faux et al., 2005). Previous analyses of A. thaliana and O. sativa proteins have shown that amino acid repeats are also overrepresented in their transcription factors (Zhang et al., 2006).

Analysis of EST-SSR markers

In the present study, we developed 20 EST-SSR markers for Q. mongolica. The number of alleles per locus (Na) and observed heterozygosity (Ho) ranged from 2 to 12, and from 0.25 to 1.00, respectively (Table 4). Mishima et al. (2006) surveyed the polymorphisms of 11 genomic di-SSR markers in 67 Q. mongolica individuals, and reportedly found 5 to 18 alleles per locus with observed heterozygosity values ranging from 0.522 to 0.896. Thus (although the number of individuals genotyped differs substantially between the present investigation and the cited study), the level of polymorphism appears to be higher for genomic SSRs (gSSRs) than for EST-SSRs, in accordance with trends found in a systematic study of sunflower (Pashley et al., 2006). However, Pashley et al. (2006) also found no significant differences between the levels of variability in transferable EST-SSR and gSSR markers in their systematic comparison. EST-SSR markers are reported to be more transferable to related species than anonymous gSSR markers (Chabane et al., 2005; Chagne et al., 2004; Rungis et al., 2004). Therefore, EST-SSR markers are likely to be especially useful in analyses of genetic diversity in populations in hybrid zones (where simultaneous analyses of hybridizing species are required) or at the distributional margins of species (where null alleles are likely to be present), since the frequency of null alleles is likely to be lower for EST-SSR markers, owing to their conserved nature, than for other types of markers that could be used. Ishida et al. (2003) detected hybrid individuals between Q. mongolica and Q. dentata in a natural population, using morphological and AFLP data. In such a situation, the EST-SSR markers developed in the present study should be useful for analyzing gene flow between species. We believe that the PCR primers designed in the present study should be of considerable assistance in future studies.

The authors are grateful to M. Kanno, A. Kanazashi, H. Yoshimaru, K. Ishida, Y. Koyama and K. Koono for sample collection. This research was supported by a grant for Research on Genetic Guideline for Restoration Programs using Genetic Diversity Information from the Ministry of Environment, Japan. The ESTs described in this paper have been submitted to the DDBJ with the accession numbers DB996174-DB999558.


References
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., et al. (2004) UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 32, D115–119.
Chabane, K., Ablett, G. A., Cordeiro, G. M., Valkoun, J., and Henry, R. J. (2005) EST versus genomic derived microsatellite markers for genotyping wild and cultivated barley. Genet. Resour. Crop Evol. 52, 903–909.
Chagne, D., Chaumeil, P., Ramboer, A., Collada, C., Guevara, A., Cervera, M. T., Vendramin, G. G., Garcia, V., Frigerio, J. M., Echt, C., et al. (2004) Cross-species transferability and mapping of genomic and cDNA SSRs in pines. Theor. Appl. Genet. 109, 1204–1214.
Chang, S., Puryear, J., and Cairney, J. (1993) A simple and efficient method for isolating RNA from pine trees. Plant Mol. Biol. Reptr. 11, 113–116.
Dieringer, D., and Schlötterer, C. (2003) MICROSATELLITE ANALYZER (MSA): a platform independent analysis tool for large microsatellite data sets. Mol. Ecol. Notes 3, 167–169.
Ewing, B., and Green, P. (1998) Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194.
Ewing, B., Hillier, L., Wendl, M. C., and Green, P. (1998) Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185.
Faux, N. G., Bottomley, S. P., Lesk, A. M., Irving, J. A., Morrison, J. R., de la Banda, M. G., and Whisstock, J. C. (2005) Functional insights from the distribution and role of homopeptide repeat-containing proteins. Genome Res. 15, 537–551.
Felsenstein, J. (1989) PHYLIP – Phylogeny Inference Package (Version 3.2). Cladistics 5, 164–166.
Fukuoka, H., Nunome, T., Minamiyama, Y., Kono, I., Namiki, N., and Kojima, A. (2005) Read2Marker: a data processing tool for microsatellite marker development from a large data set. Biotechniques 39, 472, 474, 476.
Green, P. (1999) Documentation for phrap and cross_match [online] Available from http://bozeman.mbt.washington.edu/phrap.docs/phrap.html [accessed 7 March 2007].
Harris, M. A., Clark, J., Ireland, A., Lomax, J., Ashburner, M., Foulger, R., Eilbeck, K., Lewis, S., Marshall, B., Mungall, C., et al. (2004) The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 32, D258–D261.
Iseli, C., Jongeneel, C. V., and Bucher, P. (1999) ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc. Int. Conf. Intell. Syst. Mol. Biol., 138–148.
Ishida, T. A., Hattori, K., Sato, H., and Kimura, M. T. (2003) Differentiation and hybridization between Quercus crispula and Q. dentata (Fagaceae): Insights from morphological traits, amplified fragment length polymorphism markers, and leafminer composition. Am. J. Bot. 90, 769–776.
Kado, T., Yoshimaru, H., Tsumura, Y., and Tachida, H. (2003) DNA Variation in a Conifer, Cryptomeria japonica (Cupressaceae sensu lato). Genetics 164, 1547–1559.
Kanno, M. (2005) Phylogeography and population genetic study of sect. Prinus of genus Quercus (Fagaceae) in Japan. Doctoral dissertation. Tohoku University, Sendai.
Katti, M. V., Ranjekar, P. K., and Gupta, V. S. (2001) Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol. Biol. Evol. 18, 1161–1167.
Kondo, H., Tahira, T., Hayashi, H., Oshima, K., and Hayashi, K. (2000) Microsatellite genotyping of post-PCR fluorescently labeled markers. Biotechniques 29, 868–872.
Kumar, S., Tamura, K., and Nei, M. (2004) MEGA 3: Integrated Software for Molecular Evolutionary Genetics Analysis and Sequence Alignment. Briefings in Bioinformatics 5, 150–163.
Lagercrantz, U., Ellegren, H., and Andersson, L. (1993) The abundance of various polymorphic microsatellite motifs differs between plants and vertebrates. Nucleic Acids Res. 21, 1111–1115.
La Rota, M., Kantety, R. V., Yu, J. K., and Sorrells, M. E. (2005) Nonrandom distribution and frequencies of genomic and EST-derived microsatellite markers in rice, wheat, and barley [online]. BMC Genomics 6, 23. doi:10.1186/1471-2164-6-23.
Manos, P. S., and Stanford, A. M. (2001) The historical biogeography of Fagaceae: Tracking the tertiary history of temperate and subtropical forests of the Northern Hemisphere. Int. J. Plant Sci. 162, S77–S93.
Matyas, C. (1996) Climatic adaptation of trees: Rediscovering provenance tests. Euphytica 92, 45–54.
McKay, J. K., Christian, C. E., Harrison, S., and Rice, K. J. (2005) “How local is local?” – A review of practical and conceptual issues in the genetics of restoration. Restoration Ecol. 13, 432–440.
Metzgar, D., Bytof, J., and Wills, C. (2000) Selection against frameshift mutations limits microsatellite expansion in coding DNA. Genome Res. 10, 72–80.
Ministry of Agriculture, Forestry and Fisheries of Japan. (2002) Forestry District Survey Report (In Japanese).
Mishima, K., Watanabe, A., Isoda, K., Ubukata, M., and Takata, K. (2006) Isolation and characterization of microsatellite loci from Quercus mongolica var. crispula. Mol. Ecol. Notes 6, 695–697.
Morgante, M., Hanafey, M., and Powell, W. (2002) Microsatellites are preferentially associated with nonrepetitive DNA in plant genomes. Nat. Genet. 30, 194–200.
Parkinson, J., Anthony, A., Wasmuth, J., Schmid, R., Hedley, A., and Blaxter, M. (2004) PartiGene–constructing partial genomes. Bioinformatics 20, 1398–1404.
Parkinson, J., Guiliano, D. B., and Blaxter, M. (2002) Making sense of EST sequences by CLOBBing them. BMC Bioinformatics 3, 31.
Pashley, C. H., Ellis, J. R., McCauley, D. E., and Burke, J. M. (2006) EST databases as a source for molecular markers: lessons from Helianthus. J. Hered. 97, 381–388.
Rowland, L. J., Mehra, S., Dhanaraj, A. L., Ogden, E. L., Slovin, J. P., and Ehlenfeldt, M. K. (2003) Development of EST-PCR markers for DNA fingerprinting and genetic relationship studies in blueberry (Vaccinium, section Cyanococcus). J. Am. Soc. Hort. Sci. 128, 682–690.
Rozen, S., and Skaletsky, H. J. (2000) Primer3 on the WWW for general users and for biologist programmers. In: Bioinformatics Methods and Protocols: Methods in Molecular Biology (eds.: S. Krawetz, and S. Misener), pp. 365–386. Humana Press, Totowa.
Rungis, D., Berube, Y., Zhang, J., Ralph, S., Ritland, C. E., Ellis, B. E., Douglas, C., Bohlmann, J., and Ritland, K. (2004) Robust simple sequence repeat markers for spruce (Picea spp.) from expressed sequence tags. Theor. Appl. Genet. 109, 1283–1294.
Saenz-Romero, C., Guzman-Reyna, R. R., and Rehfeldt, G. E. (2006) Altitudinal genetic variation among Pinus oocarpa populations in Michoacan, Mexico – Implications for seed zoning, conservation, tree breeding and global warming. For. Ecol. Manage. 229, 340–350.
Schubert, R., Mueller-Starck, G., and Riegel, R. (2001) Development of EST-PCR markers and monitoring their intrapopulational genetic variation in Picea abies (L.) Karst. Theor. Appl. Genet. 103, 1223–1231.
Tani, N., Takahashi, T., Iwata, H., Mukai, Y., Ujino-Ihara, T., Matsumoto, A., Yoshimura, K., Yoshimaru, H., Murai, M., Nagasaka, K., et al. (2003) A consensus linkage map for sugi (Cryptomeria japonica) from two pedigrees, based on microsatellites and expressed sequence tags. Genetics 165, 1551–1568.
Temnykh, S., DeClerck, G., Lukashova, A., Lipovich, L., Cartinhour, S., and McCouch, S. (2001) Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11, 1441–1452.
Zhang, L., Yu, S., Cao, Y., Wang, J., Zuo, K., Qin, J., and Tang, K. (2006) Distributional gradient of amino acid repeats in plant proteins. Genome 49, 900–905.