Genes & Genetic Systems
Online ISSN : 1880-5779
Print ISSN : 1341-7568
ISSN-L : 1341-7568
Full papers
Evolution of GC content in the histone gene repeating units from Drosophila lutescens, D. takahashii and D. pseudoobscura
Yuko Nakashima, Asako Higashiyama, Ayana Ushimaru, Nozomi Nagoda, Yoshinori Matsuo

2016 Volume 91 Issue 1 Pages 27-36

ABSTRACT

A subset of histone genes (H1, H2A, H2B and H4), which are encoded along with H3 within repeating units, were analyzed in Drosophila lutescens, D. takahashii and D. pseudoobscura to investigate the evolutionary mechanisms influencing this multigene family and its GC content. Nucleotide divergence among species was more marked in the less functional regions. A strong inverse relationship was observed between the extent of evolutionary divergence and GC content within the repeating units; this finding indicated that the functional constraint on a region must be associated with both divergence and GC content. The GC content at 3rd codon positions in the histone genes from D. lutescens and D. takahashii was higher than that from D. melanogaster, while that from D. pseudoobscura was similar. These evolutionary patterns were similar to those of H3 gene regions. Based on these findings, we propose that the evolutionary mechanisms governing nucleotide content at 3rd codon positions tend to eliminate A and T nucleotides more frequently than G and C nucleotides. These changes might be the consequence of negative selection and would result in GC-rich 3rd codon positions. In addition, interspecific differences in GC content, which exhibited the same pattern for all histone genes, could be explained by different selection efficiencies that result from changes in population size.

INTRODUCTION

The genome of an organism usually comprises various types of genes and elements; these include single-copy genes, duplicated genes, dispersed multigene families, clustered multigene families, repetitive elements and transposons, among others (Adams et al., 2000). The mechanisms underlying how such genes and elements are evolving must be studied to fully understand the evolution of genomes and populations. The histone genes represent a multigene family that forms a tandem repetitive cluster. Repeating units comprising the five replication-dependent histone genes (H1, H2A, H2B, H3 and H4) are tandemly repeated approximately 110 times in Drosophila melanogaster (Lifton et al., 1977). The members of this multigene histone family in D. melanogaster have evolved in a concerted fashion (Coen et al., 1982; Matsuo and Yamazaki, 1989b; Kakita et al., 2003). Although the nucleotide sequences of the H3 genes have been analyzed in a wide range of Drosophila species (Matsuo, 2000a, 2000b, 2003, 2006; Tsunemoto and Matsuo, 2001), fewer studies have focused on the remainder of the repeated unit (Matsuo and Yamazaki, 1989a; Kremer and Hennig, 1990; Fitch and Strausbaugh, 1993; Schienman et al., 1998; Nagel and Grossbach, 2000; Tsunemoto and Matsuo, 2001; Kakita et al., 2003; Nagoda et al., 2005). DNA sequence data on the variability and divergence of multigene families have accumulated for the histone genes (Matsuo, 2006). Analyzing the histone genes from various Drosophila species could provide important information about mutation bias and population size, among other issues.

The evolutionary mechanisms that generate codon bias and GC content variation have been studied and discussed extensively (Ikemura, 1985; Bernardi and Bernardi, 1986; Li, 1987; Shields et al., 1988; Poh et al., 2012; Lawrie et al., 2013). Selective pressure and mutational bias are considered to be the main factors affecting GC content at 3rd codon positions (Moriyama and Hartl, 1993; Kliman and Hey, 1994; Akashi, 1995; Powell and Moriyama, 1997; Akashi et al., 2006; Heger and Ponting, 2007; Plotkin and Kudla, 2011). The factors affecting GC content at 3rd codon positions are relevant to concerted evolution (Matsuo, 2006). Studies of the GC content at 3rd codon positions within Xdh, Adh and several other genes in the Drosophila saltans group have shown that a low GC content in the saltans lineage can be explained by a shift in mutation pressure (Rodriguez-Trelles et al., 1999, 2000). However, analysis of the H3 gene region in Drosophila indicates that selection, not mutation bias, may be a major factor responsible for generating this codon bias (Matsuo, 2000b). Specifically, it was proposed that the effect of population size on selection efficiency is responsible for observed interspecific differences in GC content (Matsuo, 2000b, 2003, 2006). To identify and assess the contribution of population size to GC content, nucleotide sequence data for other histone genes and regions within the repeating unit should be analyzed. If the evolutionary pattern of the other histone genes is quite different from that of the H3 gene, the effect of population size would be small because all the genes in a genome are presumably affected simultaneously by any series of historical events experienced by a species. In this study, the histone genes linked within a repeating unit, H1, H2A, H2B and H4, are analyzed. Many genes should be studied to draw genome-wide conclusions.

To investigate the evolutionary mechanisms influencing the histone multigene family in Drosophila, DNA sequence data from as many Drosophila species as possible should be used. In this study, the H1, H2A, H2B and H4 histone genes within the repeating units were cloned from D. lutescens, D. takahashii and D. pseudoobscura and analyzed to investigate the evolutionary mechanisms that affect GC content of this gene family.

MATERIALS AND METHODS

Drosophila strains and PCR amplification

The Drosophila strains from D. lutescens, D. takahashii and D. pseudoobscura were kindly donated by Kyushu University, Japan. Genomic DNA was extracted from larvae with a DNA extraction kit (Sepa Gene Kit, Sanko Junyaku, Tokyo, Japan). Genomic DNA was amplified by polymerase chain reaction (PCR) with Takara Ex Taq (Takara Bio, Shiga, Japan) and the protocol of Saiki et al. (1985); briefly, each PCR cycle (n = 40) comprised denaturation at 94 ℃ for 1 min, annealing at 55 ℃ for 2 min, polymerization at 70 ℃ for 2 min, and an extension step of 5 sec. The nucleotide sequences of the primers used for PCR amplification were 5′-GGCGAGCGTGCTTAA-3′ (H3F3), 5′-GGCGAGCGAGCTTAA-3′ (H3F10) and 5′-TTGGTACGGGCCAT-3′ (H3R20). The H3F3 and H3R20 primer pair was used for D. pseudoobscura, and the H3F10 and H3R20 pair was used for D. takahashii and D. lutescens. Because the H3 genes of these species have already been investigated, primers were designed to amplify all of a typical single histone gene repeating unit except the H3 gene within the unit (Tsunemoto and Matsuo, 2001). Amplification of genomic DNA with these primers resulted in a single major band for each of the three Drosophila species. Arrangement, polarity and length of each histone gene within the clones derived from these bands were determined by sequencing, which confirmed that the clones were typical repeating units for the respective species. No protein-coding genes other than the five histones have been found in any of the cloned or analyzed repeating histone gene units (Matsuo and Yamazaki, 1989a).

Cloning and sequencing

PCR products were cloned into the plasmid vector pCR2.1 (Invitrogen, Carlsbad, CA, USA) and sequenced with a Dye Terminator sequencing kit (Applied Biosystems, Foster City, CA, USA) and an ABI 310 sequencer (Sanger et al., 1977). The sequencing strategy for the histone gene repeating unit was similar to that described previously (Tsunemoto and Matsuo, 2001) with some modifications. As shown by Kakita et al. (2003), heterogeneity among histone gene clones from a single Drosophila strain, most of which reflects genetic differences among family members, is 0.6–0.7%. This level of difference is much lower than the interspecific differences observed here, which were approximately 10%.

DNA analysis

DNA sequences of the histone gene units from D. lutescens, D. takahashii and D. pseudoobscura were deposited in the DNA Data Bank of Japan (DDBJ). The accession numbers for the units are AB249649, AB249650 and AB249651 for D. lutescens, D. takahashii and D. pseudoobscura, respectively. The DNA sequences of other Drosophila histone gene units were obtained from DDBJ/GenBank: accession numbers X14215 for D. melanogaster (Matsuo and Yamazaki, 1989), X17072 and X52576 for D. hydei (Kremer and Hennig, 1990; Fitch and Strausbaugh, 1993), and AB192418 for D. americana (Nagoda et al., 2005). The analysis was restricted to Drosophila species whose histone genes have been experimentally studied, cloned and sequenced. In studying clusters of tandemly duplicated genes like the histone gene family, it is hard to know which genes (or repetitive units) are orthologous to each other. This problem arises because highly similar units are repeated many times, and variation in the number of repeats is found both within and between species. In most tandem clusters, as found here for the histone gene cluster, the family members show concerted evolution (Coen et al., 1982; Matsuo and Yamazaki, 1989b). Therefore, only one repeating unit of typical length was analyzed for each species. Multiple sequence alignment and phylogenetic analysis were conducted using the Clustal W program (Thompson et al., 1994) from DDBJ with the neighbor-joining (NJ) method (Saitou and Nei, 1987). Bootstrap values were calculated from 1,000 replications. Since the core histones are highly conserved, most nucleotide substitutions in the coding regions are synonymous changes. In such cases, the percentage of nucleotide sites that differ is a better measure of divergence.
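The divergence measure described above can be sketched as follows. This is a minimal illustration, assuming two sequences that have already been aligned and assuming that columns containing a gap are excluded from the count; the text does not state how gaps were handled, and the function name and the toy sequences are hypothetical.

```python
def percent_difference(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of aligned sites that differ between two sequences.

    Assumption: columns in which either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # gapped column, not counted as a compared site
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * different / compared


# Toy aligned fragments (hypothetical, not taken from the deposited sequences):
# nine comparable sites, one of which differs.
example = percent_difference("ATGGCC-TGA", "ATGGCTATGA")
```

With real data, the two arguments would be rows taken from the Clustal W alignment of a given region.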

RESULTS

Structure of the histone gene repeating units from D. lutescens, D. takahashii and D. pseudoobscura

To determine the structure and nucleotide sequence of the repeating units encoding the histone genes in D. lutescens, D. takahashii and D. pseudoobscura, PCR products amplified from genomic DNA were cloned into a plasmid vector. The H3 genes from these species have been analyzed previously; therefore, the primers used here for DNA amplification were designed to cover all of a single repeating unit except the H3 gene (Fig. 1). The nucleotide sequences of inserts of 4,785 bp, 5,071 bp and 5,042 bp (including the H3 gene being 5,196 bp, 5,482 bp and 5,453 bp) for D. lutescens, D. takahashii and D. pseudoobscura, respectively, showed that the H4, H2A, H2B and H1 genes were coded in this order within a single repeating unit (Fig. 1). For each species, the locations and transcriptional polarities of the four histone genes within the respective units were identical to those found in other Drosophila species (Lifton et al., 1977; Matsuo and Yamazaki, 1989a; Kremer and Hennig, 1990; Fitch and Strausbaugh, 1993; Nagel and Grossbach, 2000; Tsunemoto and Matsuo, 2001; Kakita et al., 2003; Nagoda et al., 2005).
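As a quick consistency check on the lengths quoted above, the segment not covered by the cloned inserts can be derived by subtraction. The sketch below uses only the six lengths given in the text; the variable names are illustrative, and the derived value is a check rather than a figure stated by the authors.

```python
# Lengths in bp, as reported in the text.
insert_without_h3 = {
    "D. lutescens": 4785,
    "D. takahashii": 5071,
    "D. pseudoobscura": 5042,
}
unit_with_h3 = {
    "D. lutescens": 5196,
    "D. takahashii": 5482,
    "D. pseudoobscura": 5453,
}

# Length of the segment excluded from each cloned insert.
excluded = {
    species: unit_with_h3[species] - insert_without_h3[species]
    for species in insert_without_h3
}
```

The difference is 411 bp for all three species, which is consistent with the 136 third-codon sites listed for H3 in Table 4 plus a stop codon (136 × 3 + 3 = 411).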

Fig. 1.

Organization of the histone gene repeating unit in D. melanogaster, D. pseudoobscura, D. lutescens and D. takahashii. The numbers indicate the size (bp) of the regions or units. Arrows show the direction of transcription. The insertion found in the spacer between H1 and H3 in D. takahashii is a repeated sequence (shown in light blue).

Two spacer regions, one between the H4 and H2A genes in D. pseudoobscura and the other between the H1 and H3 genes in D. takahashii, were substantially larger than the corresponding regions in D. melanogaster; the difference in size between the D. takahashii spacer and the corresponding D. lutescens spacer was attributed to a direct duplication of a sequence of about 150 bp (shown in light blue in Fig. 1). Although a number of small insertions and deletions were found in the spacer regions, the differences in spacer size among species were otherwise relatively small (Fig. 1). Similarly, the sizes of the coding regions were highly conserved. The lengths of the H4, H2A and H2B genes were identical to those of D. melanogaster. However, compared to D. melanogaster, the H1 gene was 3 bp shorter in D. pseudoobscura and 18 bp shorter in D. lutescens and D. takahashii; because these differences are multiples of 3 bp, they did not result in any frameshift mutations. The amino acid sequences of H1 in these four species showed higher sequence divergence than those of the other histones.

Divergence, functional constraint and GC content of the regions in the histone gene repeating units

Comparisons of the nucleotide sequences of the repeating units from each of the three species to that of D. melanogaster revealed that divergence, represented as percent nucleotide difference, was larger in spacer regions than in coding regions (Fig. 2, Table 1). Moreover, the spacers in the 3’ regions (downstream of the coding regions) between H4-H2A and H2B-H1 were more divergent than the spacers in the 5’ regions (upstream of the coding region) between H3-H4 and H2A-H2B (Fig. 2, Table 1). The largest spacer, between H1-H3, showed the highest divergence. This observed pattern of divergence, in which functionally more important regions are more highly conserved than less important regions (Miyata and Yasunaga, 1981; Li, 1983), is consistent with the results obtained for the H3 gene region (Matsuo and Yamazaki, 1989b) and for the histone gene repeating units of the D. melanogaster species subgroup (Tsunemoto and Matsuo, 2001; Kakita et al., 2003). It is likely that negative selection was associated with the extent of nucleotide divergence between species (Kimura, 1983). A phylogenetic tree constructed with these repeating units that is consistent with trees constructed from the H3 gene (Matsuo, 2003) and other genes (Powell and DeSalle, 1995; Russo et al., 1995; Inomata et al., 1997) is shown in Fig. 3.

Fig. 2.

Nucleotide divergence in the histone gene repeating unit of D. lutescens, D. takahashii and D. pseudoobscura compared to D. melanogaster. Divergences in the spacer H3-H4, H4 gene, spacer H4-H2A, H2A gene, spacer H2A-H2B, H2B gene, spacer H2B-H1, H1 gene and spacer H1-H3 are shown separately.

Table 1. χ2 tests for heterogeneity of divergence with D. melanogaster in the coding vs. spacer regions, and in the 5’ vs. 3’ regions

Comparison with D. melanogaster              D. lutescens   D. takahashii   D. pseudoobscura

Coding region#
  Different site (D)                         177            203             278
  Identical site (I)                         1635           1627            1534
  Total site (T)                             1812           1830            1812
  Divergence (D/T)                           0.098          0.111           0.153
Spacer region
  Different site (D)                         754            1346            1137
  Identical site (I)                         1973           1964            1508
  Total site (T)                             2727           3310            2645
  Divergence (D/T)                           0.277          0.407           0.430
χ2 test, Coding - Spacer
  χ2 (d.f. = 1)                              133***         326***          227***

5’ region (H3-H4, H2A-H2B)
  Different site (D)                         78             116             158
  Identical site (I)                         417            426             312
  Total site (T)                             495            542             470
  Divergence (D/T)                           0.158          0.214           0.336
3’ region (H4-H2A, H2B-H1)
  Different site (D)                         235            399             332
  Identical site (I)                         615            581             519
  Total site (T)                             850            980             851
  Divergence (D/T)                           0.277          0.407           0.390
χ2 test, 5’ region – 3’ region
  χ2 (d.f. = 1)                              15.7***        37.4***         2.43

*** P < 0.001.

# H3 is not included.

Fig. 3.

Phylogenetic relationships of the histone gene repeating units in Drosophila. Nucleotide sequences of the repeating unit except for the H3 gene were analyzed to generate a phylogenetic tree by the NJ method. Bootstrap values for 1000 replications are indicated.

Separate comparisons for GC content in distinct regions of the repeating unit were conducted for each Drosophila species (Fig. 4). The results revealed that GC content was higher in the coding regions and lower in the spacer regions (Table 2). In the spacer regions, the GC content in the 5’ regions of genes was higher than in the 3’ regions (Table 2). While interspecific differences in GC content were apparent in the coding regions and to a lesser degree in the 5’ regions, no such differences were observed in the 3’ regions or in the longer H1-H3 spacer (Fig. 4, Table 2). A significant difference in the GC content between species was not found in the 3’-most region of the entire repeated unit, suggesting that GC content and nucleotide substitutions are in a steady state as was reported previously for the 3’ region of the H3 gene (Matsuo, 2000b).

Fig. 4.

Comparison of the GC content of the histone gene repeating units in six Drosophila species: D. melanogaster, D. lutescens, D. takahashii, D. pseudoobscura, D. americana and D. hydei. The GC content of the spacer H3-H4, H4 gene, spacer H4-H2A, H2A gene, spacer H2A-H2B, H2B gene, spacer H2B-H1, H1 gene and spacer H1-H3 is compared separately for each of the six indicated species.

Table 2. χ2 tests for heterogeneity of GC content in the coding vs. spacer regions, in the 5’ vs. 3’ regions, and between species

Species#                       MEL       LUT       TAK       PSE       AME       HYD       χ2 test between species (d.f. = 5)

Coding region##
  GC                           917       999       992       958       884       861       35.4***
  AT                           913       813       820       875       928       948
  SUM                          1830      1812      1812      1833      1812      1809
  %GC                          50.1      55.1      54.7      52.3      48.8      47.6
Spacer region
  GC                           887       974       1051      1046      891       889       6.1
  AT                           1915      1999      2208      2163      1911      2059
  SUM                          2802      2973      3259      3209      2802      2948
  %GC                          31.7      32.8      32.3      32.6      31.8      30.2
χ2 test, Coding - Spacer
  χ2 (d.f. = 1)                159***    233***    245***    188***    134***    147***

5’ region (H3-H4, H2A-H2B)
  GC                           211       225       221       196       188       208       16.3*
  AT                           311       281       284       288       347       369
  SUM                          522       506       505       484       535       577
  %GC                          40.4      44.5      43.8      40.5      35.1      36.0
3’ region (H4-H2A, H2B-H1)
  GC                           289       317       341       463       342       342       5.2
  AT                           591       661       639       911       667       776
  SUM                          880       978       980       1374      1009      1118
  %GC                          32.8      32.4      34.8      33.7      33.9      30.6
χ2 test, 5’ region – 3’ region
  χ2 (d.f. = 1)                8.21**    20.9***   11.4***   7.23**    0.24      5.17*

*** P < 0.001; ** P < 0.01; * P < 0.05.

# MEL: D. melanogaster; LUT: D. lutescens; TAK: D. takahashii; PSE: D. pseudoobscura; AME: D. americana; HYD: D. hydei.

## H3 is not included.
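The heterogeneity tests in Table 2 can be illustrated with an ordinary Pearson χ2 statistic on a table of GC and AT counts. The sketch below is an assumption about the procedure, since the text does not spell it out: it applies no continuity correction, and the function name is hypothetical. Applied to the counts in Table 2, it gives values that agree with the tabulated 8.21 (D. melanogaster, 5’ vs. 3’ regions) and 35.4 (coding regions, between species) to the precision shown.

```python
from typing import Sequence


def pearson_chi2(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-square statistic for an r x c table of counts.

    No continuity correction is applied (an assumption, see lead-in).
    Degrees of freedom are (r - 1) * (c - 1).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# D. melanogaster, 5' vs. 3' spacer regions; columns are (GC, AT). d.f. = 1.
mel_5_vs_3 = pearson_chi2([[211, 311], [289, 591]])

# Coding regions; rows are (GC, AT), columns are MEL, LUT, TAK, PSE, AME, HYD. d.f. = 5.
coding_between_species = pearson_chi2([
    [917, 999, 992, 958, 884, 861],
    [913, 813, 820, 875, 928, 948],
])
```

The same function accepts any of the other GC/AT count blocks in Table 2.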

Furthermore, for longer evolutionary periods, the extent of divergence between species and the GC content of distinct regions within the repeating units were confirmed to be inversely related; specifically, large interspecies divergence was associated with lower GC content (Table 3). These findings corroborated those of a comparative study between two closely related species, D. melanogaster and D. simulans (Tsunemoto and Matsuo, 2001), and suggest that the GC content of distinct regions within the repeat unit is subject to functional constraint.

Table 3. Correlation coefficients between the divergence and GC content in the regions of the histone gene repeating unit

Interspecies divergence                 GC mel (a)    GC lut     GC tak     GC pse
D. melanogaster – D. lutescens          –0.97* (b)    –0.97*
D. melanogaster – D. takahashii         –0.97*                   –0.97*
D. melanogaster – D. pseudoobscura      –0.97*                              –0.96*

a GC mel, GC lut, GC tak and GC pse mean the GC content for D. melanogaster, D. lutescens, D. takahashii and D. pseudoobscura, respectively.

b Correlation coefficients between the interspecies divergence and GC content in the regions of the histone gene repeating unit. The regions studied are H3-H4 spacer, H4 gene, H4-H2A spacer, H2A gene, H2A-H2B spacer, H2B gene, H2B-H1 spacer, H1 gene and H1-H3 spacer.

* Significant values with 7 d.f., P < 0.001.
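The significance level quoted for Table 3 can be checked by converting each correlation coefficient to a t statistic with n − 2 degrees of freedom, where n = 9 is the number of regions listed in footnote b. This is a sketch of the standard conversion, not necessarily the procedure the authors used; the function name is hypothetical, and the critical value in the comment is taken from standard t tables.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for a Pearson correlation r from n paired observations (d.f. = n - 2)."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))


N_REGIONS = 9  # nine regions in footnote b, hence 7 d.f.

t_097 = t_from_r(-0.97, N_REGIONS)  # about -10.56
t_096 = t_from_r(-0.96, N_REGIONS)  # about -9.07

# Two-tailed critical value of t for P = 0.001 with 7 d.f. (standard tables).
T_CRITICAL_7DF_P001 = 5.408
```

Both statistics exceed the critical value in absolute terms, in line with the P < 0.001 stated in the footnote.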

Interspecies differences in GC content at 3rd codon positions within histone genes

The GC content of an entire coding region is affected by multiple factors, such as the protein encoded, constraints on the amino acid sequence and the presence of codon bias. However, the interspecies difference in GC content in the histone genes is mainly due to 3rd codon positions within the genes, because the amino acid sequences of histones are highly conserved between species. Therefore, the GC content at 3rd codon positions was analyzed further and compared among these Drosophila species (Fig. 5). Marked differences in GC content at 3rd codon positions were evident between histone genes within a species; for example, the %GC values were higher for the H2B gene than for the H1 gene, as found previously (Fitch and Strausbaugh, 1993; Tsunemoto and Matsuo, 2001). Nevertheless, interspecies differences were also apparent for histone genes (Table 4). For each histone gene studied (H4, H2A, H2B and H1), the GC content at 3rd codon positions in D. lutescens and D. takahashii was higher than that in D. melanogaster; although the interspecies differences for H2A were not significant, the differences for H4, H2B and H1 were significant at the 1% level (Table 4). For comparison, GC content data for the H3 genes in these species (Matsuo, 2003) are shown along with the corresponding data from D. americana (Nagoda et al., 2005) and D. hydei (Fitch and Strausbaugh, 1993) in Fig. 5. D. americana and D. hydei were chosen for comparison because these species are distantly related to the melanogaster subgroup and DNA sequence data for their histone gene repeating units were available. For each histone gene, interspecific differences in GC content showed trends similar to those observed previously for the H3 gene (Fig. 5); these findings indicated that the entire repeating unit is evolving in a similar manner.
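The quantity compared in this section, %GC at 3rd codon positions, can be computed from an in-frame coding sequence as sketched below. The function name and the toy sequence are hypothetical, and whether the stop codon is counted is left to the caller, since the text does not say how it was treated.

```python
def gc3_percent(cds: str) -> float:
    """Percent G or C at 3rd codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    third_positions = cds[2::3]  # every 3rd base, starting from the 3rd
    gc_count = sum(base in "GC" for base in third_positions)
    return 100.0 * gc_count / len(third_positions)


# Toy sequence (hypothetical): codons ATG GCC AAG TTA have 3rd bases G, C, G, A.
toy = gc3_percent("ATGGCCAAGTTA")
```

The same ratio underlies the %GC3rd rows of Table 4; for example, 58 GC sites out of 103 for H4 in D. melanogaster corresponds to 56.3%.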

Fig. 5.

Comparison of GC content at 3rd codon position of the histone genes in six Drosophila species. Data for the H3 genes are from Matsuo (2003).

Table 4. χ2 tests for heterogeneity of GC content at 3rd codon position between histone genes, and between six species

Gene     Species#      MEL      LUT      TAK      PSE      AME      HYD      χ2 test between species (d.f. = 5)

H4       GC3rd         58       66       64       55       42       44       19.5**
         AT3rd         45       37       39       48       61       59
         SUM           103      103      103      103      103      103
         %GC3rd        56.3     64.1     62.1     53.4     40.8     42.7
H2A      GC3rd         67       78       82       70       62       63       10.8 (P = 0.06)
         AT3rd         57       46       42       54       62       61
         SUM           124      124      124      124      124      124
         %GC3rd        54.0     62.9     66.1     56.5     50.0     50.8
H2B      GC3rd         78       94       93       75       77       70       17.9**
         AT3rd         45       29       30       48       46       53
         SUM           123      123      123      123      123      123
         %GC3rd        63.4     76.4     75.6     61.0     62.6     56.9
H1       GC3rd         124      149      145      140      118      109      20.4***
         AT3rd         132      101      105      117      132      140
         SUM           256      250      250      257      250      249
         %GC3rd        48.4     59.6     58.0     54.5     47.2     43.8
H3       GC3rd         79       89       85       75       61       66       17.4**
         AT3rd         57       47       51       61       75       70
         SUM           136      136      136      136      136      136
         %GC3rd        58.1     65.4     62.5     55.2     44.9     48.5
Total    GC3rd         406      476      469      415      360      352      75.4***
         AT3rd         336      260      267      328      376      383
         SUM           742      736      736      743      736      735
         %GC3rd        54.7     64.7     63.7     55.9     48.9     47.9

χ2 test between histone genes
  χ2 (d.f. = 4)        8.6      10.5*    11.6*    1.8      13.2*    7.3

*** P < 0.001; ** P < 0.01; * P < 0.05.

# MEL: D. melanogaster; LUT: D. lutescens; TAK: D. takahashii; PSE: D. pseudoobscura; AME: D. americana; HYD: D. hydei.

DISCUSSION

Analysis of nucleotide divergence in the histone gene repeating unit revealed that the coding regions of histone genes are more highly conserved than the sequences within the spacer regions. Furthermore, within the spacer regions, the 5’ regions of the genes are more highly conserved than the 3’ regions. It is thus apparent that the functional constraints associated with protein coding and with the regulation of gene expression are associated with the degree of divergence. The relationship between divergence and functional constraint suggested that negative selection is acting and that it has a marked influence on nucleotide divergence. The finding that the GC content is inversely related to the degree of divergence for each distinct region within the repeating unit was consistent with findings from comparisons of histone gene repeating units in two more closely related species, D. melanogaster and D. simulans (Tsunemoto and Matsuo, 2001). These results suggested that negative selection has also affected GC content in these regions. The spacers have a low GC content, implying that, like other non-functional regions such as introns and transposons, they are AT-rich (Petrov and Hartl, 1999). Since functional constraints were expected to be weak in these regions, the effect of negative selection, if any, is likely to be small. Consequently, it is possible that the GC content in these regions is influenced by mutation bias, if such a bias exists. The AT richness in these regions suggests that the pattern of mutation is biased toward the A/T direction; this supposition is consistent with findings from the Drosophila genes (Petrov and Hartl, 1999). Furthermore, interspecies differences in the GC content of these less functionally constrained regions are relatively small and not significantly different, suggesting that the relationship between GC content and nucleotide substitution is nearly in a steady state. 
This conclusion is consistent with previous results on patterns of nucleotide substitution in the 3’ region of the H3 gene (Matsuo, 2000; Tsunemoto and Matsuo, 2001). These results also indicated that the pattern of mutation is similar in the four Drosophila species studied (Matsuo, 2003). The coding and 5’ regions of these histone genes are GC-rich; this finding indicated that the effect of negative selection results in the elimination of A or T more frequently than G or C, and that A or T in these regions must be more deleterious than G or C. It is not clear why GC content becomes high in the regions subject to strong functional constraint. One possible explanation for this finding is that when the base content is dissimilar to the mutation bias, the probability of generating significant combinations of nucleotide sequence accidentally will be lower.

Selective pressure and mutational bias are considered as putative main factors affecting GC content at 3rd codon positions (Ikemura, 1985; Bernardi and Bernardi, 1986; Shields et al., 1988; Moriyama and Hartl, 1993; Kliman and Hey, 1994; Akashi, 1995; Powell and Moriyama, 1997; Heger and Ponting, 2007; Plotkin and Kudla, 2011). GC richness at 3rd codon positions is difficult to explain by mutation pattern alone because such patterns are likely to be biased toward A/T (Powell and Moriyama, 1997). Therefore, at 3rd codon positions, A or T must be eliminated at a higher frequency than are G or C, possibly by negative selection, in the same way as has been observed in other regions, such as the 5’ region. How, then, might interspecific differences in GC content at 3rd codon positions be explained? Studies of GC content at 3rd codon positions of Xdh, Adh, and several other genes in the Drosophila saltans group revealed that a low GC content in the saltans lineage could be explained by a shift in mutation pressure (Rodriguez-Trelles et al., 1999, 2000). However, as discussed above, the mutation pattern within the histone gene region may not differ among the species studied (Matsuo, 2003). Therefore, selection is the most likely explanation for the observed interspecific differences in GC content. Under the selection hypothesis, there are two possible models for explaining the observed interspecific divergence in GC content at 3rd codon positions. One model is that the direction of codon bias differs between species, and that this difference results in differences in GC content at 3rd codon positions (Poh et al., 2012). The alternative model is that divergence in GC content at 3rd codon positions could be caused by differences in selection efficiency (Matsuo, 2003, 2006). 
Although the former model cannot be excluded completely, the latter model is preferable for the following reason: it is reasonable to assume that the same selection pressure as in other genomic regions is also working on the synonymous sites in coding regions; therefore, less functional regions become more AT-rich than the more functional regions in most Drosophila species. Selection on GC content at 3rd codon positions is essentially equivalent to selection on codon choice, and its intensity is likely to be weak enough for its outcome to depend on population size (Hartl et al., 1994). Under the model of slightly deleterious mutations (Ohta, 1972), selection acts effectively when populations are large. In such instances, A or T residues at 3rd codon positions would be eliminated, and these sites would effectively become GC-rich. Conversely, in smaller populations, selection does not act effectively, and GC content at 3rd codon positions shifts toward the value favored by the mutation pattern, becoming slightly AT-rich. The former scenario is likely to have occurred in the ancestor of D. takahashii and D. lutescens, and the latter scenario in the ancestor of D. hydei and D. americana. Moreover, the effect of population size will not be manifested in a single gene only, but will have occurred simultaneously for all genes in a genome. This means that the aforementioned hypothesis can be tested by analyzing the GC content of other genes in the genomes of these species. Similar interspecific patterns in GC content at 3rd codon positions of all histone genes in a repeating unit as well as of other Drosophila genes (Matsuo, 2000a) would provide supporting evidence for the effect of population size. In determining the mechanism of concerted evolution of the histone multigene family, negative selection and the effect of population size on codon usage, as well as gene conversion and unequal crossing-over, will be important factors.
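The population-size argument can be made concrete with a simple two-state model of mutation, selection and drift at a synonymous site. This is an illustrative sketch only: the formula is a textbook-style equilibrium expression and is not taken from this paper, the function name is hypothetical, and the parameter values are chosen for illustration rather than estimated from the histone data.

```python
import math


def expected_gc3(scaled_selection: float, mutation_bias: float) -> float:
    """Equilibrium fraction of GC-ending sites in a two-state model.

    scaled_selection: population-scaled advantage of G/C over A/T at the site,
        proportional to effective population size times the selection coefficient.
    mutation_bias: ratio of the GC->AT mutation rate to the AT->GC mutation rate
        (values above 1 mean mutation is biased toward A/T).
    """
    return 1.0 / (1.0 + mutation_bias * math.exp(-scaled_selection))


# Hypothetical AT-biased mutation (bias = 2) under increasing scaled selection.
no_selection = expected_gc3(0.0, 2.0)      # mutation alone: one third GC
small_population = expected_gc3(0.5, 2.0)  # weak effective selection
large_population = expected_gc3(2.0, 2.0)  # stronger effective selection
```

With the selection coefficient held fixed, a larger effective population size raises the scaled selection term and therefore the expected GC content at 3rd codon positions, while a smaller population moves it back toward the AT-rich value set by mutation bias alone.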

ACKNOWLEDGMENTS

We would like to thank the anonymous referees for their important comments. This research was supported by a Grant-in-Aid for Scientific Research awarded to Y. M. from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

© 2015 by The Genetics Society of Japan