Anthropological Science
Online ISSN : 1348-8570
Print ISSN : 0918-7960
ISSN-L : 0918-7960
Original Articles
Constant sex difference across populations in liability of nonmetric traits
AKIRA TAGAYA
著者情報
ジャーナル フリー HTML

2020 年 128 巻 2 号 p. 57-70

詳細
Abstract

Studies have revealed the existence of statistically significant sex differences in the frequency of nonmetric traits, but no agreement seems to exist about their variability among populations. This problem was examined using the multifactorial threshold model. Considering the assumption of additive effects of factors on the liability and the nature of effect of sex difference on the development of nonmetric traits, it would be reasonable to assume that the sex difference in the mean of liability is constant across populations. This hypothesis was tested and the magnitude of sex difference was examined using the world-wide dataset collected by Ossenberg and the dual-liability threshold model formulated by the author with a modification to accommodate side difference in the probability of trait occurrence. The data were divided into 16 samples regarded as randomly sampled from regional populations. The data of 31 bilateral traits were analyzed using maximum likelihood estimation procedures. After confirming the homogeneity of the variance of liability between sexes and across populations, the homogeneity and significance of sex difference in the mean of liability were tested. The results indicate the homogeneity of sex difference across populations. The assumed constant sex difference was statistically significant in 17 traits at the 1% level, and its magnitude exceeded half the averaged distance between eight groups of populations in 12 traits. Population comparisons without distinguishing sex are justifiable if they use the traits with enough weak sex difference in comparison with population differences. Since the sex difference has proved to be basically constant across populations, the estimates of the assumed constant sex difference reported in this study would provide references for selecting traits appropriate for each comparison. The Breslow–Day test of homogeneity of sex difference indicated the inapplicability of the genotype model to the data, supporting Ossenberg’s proposal for the use of side counts.

Introduction

Although Berry and Berry (1967) proposed that no substantial difference exists between sexes in the frequency (rate of positive expression) of skeletal nonmetric traits, statistically significant differences have been reported (Corruccini, 1974; Berry, 1975; Perizonius, 1979; Mouri, 1976, 1988). There seems to have been no agreement, however, about the variability of sex difference across populations. Berry (1975) analyzed the sex difference of cranial nonmetric traits in four populations, and concluded that there was no stable tendency in the direction of sex difference. However, this is rather strange under the multifactorial threshold model, which is recognized to be the most appropriate mathematical model for explaining the expression of a nonmetric trait. This model consists of a latent variable, the ‘liability,’ and a fixed value, the ‘threshold,’ and a trait is positively expressed when its liability exceeds its threshold. The liability is assumed to additively reflect the effects of numerous factors, hence is normally distributed (Falconer, 1960). It would be natural to consider that sex should also additively affect the liability, which means that the sex difference in the liability should be constant across populations.

Mouri (1988) used Cochran’s test for multiple contingency tables to analyze 30 cranial nonmetric traits and detected statistically significant male-dominant occurrence in seven traits, and female-dominant occurrence in another seven traits; these results were consistent across 8–13 populations. This means that the direction of the difference was constant across populations, at least for these traits, although the geographic background of the study populations was limited to the Japanese archipelago and adjacent areas, and no judgement could be made concerning whether the sex difference in these traits was of constant magnitude.

The dual-liability threshold model (DLM) formulated by the present author (Tagaya, 2019) would be useful for examining this problem as it enables us to test biologically meaningful hypotheses about sex difference and its relationship with populations. The DLM is a combination of two standard multifactorial threshold models, each explaining the expression of a bilateral trait on either side. Each of the two liabilities is a sum of mutually independent two components: (1) the inter-individual component shared by sides explaining the inter-side correlation in the trait expression and (2) the inter-side component whose population variance assumed to be identical between sides and constant across populations is used as the unit of liability. This unit also provides a reference for the evaluation of the magnitude of sex difference. Since the inter-side component is assumed to be normally distributed due to its multifactorial nature, the magnitude of the sex difference must be much smaller than 1 for sex to be regarded as one of the numerous additive factors of liability in the multifactorial threshold model. A hypothesis about the sex difference is expressed as a constraint on the parameters of the DLM, and tested for its goodness of fit to the actual data using the maximum likelihood estimation (MLE) procedure. This method (the DLM–MLE method) will be used in this study to test hypotheses about the homogeneity of sex difference in the mean of liability across populations.

As Mouri (1988) showed, it is also possible to test the significance of sex difference across populations using a non-parametric method, the Cochran–Mantel–Haenszel (CMH) method, developed by Cochran (1954), Mantel and Haenszel (1959), and Breslow and Day (1980). The CMH method defines the sex difference substantially in accordance with the genotype model, and tests the hypotheses in similar fashion to the DLM–MLE method. Therefore, a comparison of the results between the two methods should provide useful information about the nature of the sex difference of nonmetric traits and the appropriateness of the DLM and genotype model for explaining the expression of non-metric traits.

This study tests the hypothesis of constant sex difference in the mean of liability across populations and, under this hypothesis, evaluates the magnitude and statistical significance of the assumed constant sex difference using the DLM–MLE method and the database published by Ossenberg (2013a, b) on the Internet. In addition, the results are compared with the corresponding results obtained with the CMH method.

Materials and Methods

Preparation of data

The data of the bilateral nonmetric traits listed in Table 1 were included in the analyses. These data were taken from the database published by Ossenberg (2013a, b). The data were divided into 16 groups, each of which was regarded to be randomly sampled from a population (listed in Table 2). An effort was made to limit and level the within-group heterogeneity, taking into consideration their geographical and historical backgrounds and the reports on nonmetric and craniometric population affinities and regional variabilities (Dodo and Ishida, 1987; Ossenberg, 1994; Shigematsu et al., 2004; Ossenberg et al., 2006; Hanihara, 2008; Hubbe et al., 2009).

Table 1 Bilateral nonmetric traits included in analyses
Code Definition
OMB Occipitomastoid ossicle
AST Asterionic ossicle
PNB Parietal notch bone
POS Posterior condylar canal absent
HYP Hypoglossal canal bridged or double
PCP Paracondylar process
ICC Intermediate condylar canal
SQS Parietal process of temporal squama
SPS Squamoparietal synostosis
MAR Marginal foramen of tympanic plate
TYM Tympanic dehiscence
FSP Dehiscent wall of foramen spinosum or foramen ovale
LPF Foramen in lateral pterygoid plate
CIV Pterygospinous bridge complete (foramen of Civinini)
PTB Pterygobasal spur or bridge
CLN Clinoid bridging
SOF Supraorbital foramen
FRG Frontal groove(s)
TRS Trochlear spur
OPT Accessory optic canal
ORB Orbital suture variant
CON Infraorbital suture variant
JAP Transversozygomatic suture
M3U Upper third molar congenitally absent
MEN Accessory mental foramen
MHB Mylohyoid bridge
BUC Retromolar foramen
M3L Lower third molar congenitally absent
TRM Three-rooted mandibular first molar
ATA Atlas bridging, condylar process to posterior arch
ATB Atlas bridging, condylar to transverse process
Table 2 Grouping and sample size of the data included in analyses
(Group) Sample size Composition
 Population M F Total
(Africa) (195) (167) (362)
 Khoisan 24 13 37 San and Khoe speakers
 North Africa 48 32 80 Mainly from Kerma site in Sudan
 Sub-Sahara 89 94 183 East, West, and South Africans excluding Khoisans
 US African 34 28 62 US inhabitants of African origin
(Indo-Europe) (416) (316) (732)
 Europe 344 261 605 Ancient and modern Europeans including British Canadians
 India 72 55 127 Housed in University of Alberta
(Australia) (30) (23) (53)
 Australia 30 23 53 Australian Aboriginals including Tasmanians
(Jomon-Ainu) (204) (199) (403)
 Jomon 122 136 258 From Honshu and Hokkaido, including Epi-Jomon
 Ainu 82 63 145 Mostly from Hokkaido
(NE Asia) (534) (303) (837)
 Japan 444 264 708 Yayoi, medieval, Edo, and modern Japanese
 NE Asia 90 39 129 Manchuria and Mongolia
(Circum-Arctic) (1347) (1425) (2772)
 Siberia 96 92 188 Tungus, Chukchi, Okhotsk, and Yukaghir
 Aleutian 212 216 428 Mostly Central and Eastern Aleuts
 Arctic 1039 1117 2156 Mainly Inupik and Yupik speakers
(America) (1076) (864) (1940)
 America 1076 864 1940 Athapaskans and non-Arctic Native Americans
(Polynesia) (73) (51) (124)
 Polynesia 73 51 124 Marquesans, New Zealand Maori, and Moriori
Total 3875 3348 7223

Any positive expression was regarded as a trait occurrence. The counts were recorded for four types of trait expression: (1) on neither side, (2) on the left side only, (3) on the right side only, and (4) on both sides. The set of any values (counts, frequencies, probabilities, etc.) corresponding to these four types of trait expression in this order will be called a ‘quartet,’ and the quartet of values a, b, c, and d will be denoted as {a, b, c, d} hereafter. To ensure the determinability of the DLM parameters (by avoiding zero frequency of trait occurrence), {a + 1/8, b + 1/4, c + 1/4, d + 1/8} was used instead of the count quartet {a, b, c, d} for analyses. This is to assume that there was a chance of observing an additional case at the probabilities {1/8, 1/4, 1/4, 1/8} for respective types although it did not happen. As a result, the side-count trait frequency is calculated as (b + d + 3/8)/(a + b + c + d + 3/4) for the left side and (c + d + 3/8)/(a + b + c + d + 3/4) for the right side, which happens to be equivalent to Anscombe’s (1948) modification in the angular transformation. Samples with a size for a trait of less than 9 were excluded from the analyses of that trait. The side-count trait frequencies are given in Table 3.

Table 3 Side-count frequencies for each population (data based on eight or fewer individuals were not included in analyses)
Khoisan North Africa Sub-Sahara US African Europe India Australia Jomon
M F M F M F M F M F M F M F M F
Nh* 18 12 40 28 72 64 29 25 237 175 69 54 26 20 35 40
OMB .188 .115 .122 .204 .136 .090 .021 .000 .035 .037 .069 .085 .200 .238 .196 .173
AST .088 .115 .284 .214 .236 .245 .150 .120 .153 .094 .215 .160 .304 .159 .296 .331
PNB .194 .154 .273 .268 .192 .223 .133 .115 .188 .166 .179 .222 .125 .114 .329 .204
POS .389 .458 .429 .375 .291 .317 .338 .393 .391 .347 .157 .209 .433 .370 .179 .181
HYP .056 .182 .109 .194 .094 .113 .176 .107 .205 .196 .143 .173 .017 .000 .132 .160
PCP .088 .000 .098 .048 .102 .020 .088 .036 .166 .117 .290 .182 .077 .045 .167 .204
ICC .278 .125 .302 .371 .315 .269 .333 .268 .316 .281 .324 .255 .222 .136 .167 .400
SQS .026 .042 .068 .130 .051 .091 .069 .111 .027 .010 .014 .009 .033 .023 .000 .016
SPS .000 .000 .000 .018 .000 .005 .045 .054 .011 .002 .000 .009 .017 .000 .000 .000
MAR .211 .083 .128 .154 .084 .080 .121 .152 .045 .049 .143 .157 .143 .068 .059 .014
TYM .000 .462 .152 .150 .169 .272 .121 .231 .127 .182 .164 .213 .207 .159 .183 .295
FSP .118 .346 .326 .259 .214 .359 .106 .125 .171 .264 .206 .179 .339 .342 .200 .200
LPF .000 .100 .135 .080 .068 .043 .088 .036 .073 .070 .043 .036 .021 .000
CIV .000 .000 .011 .000 .011 .000 .015 .000 .062 .030 .036 .009 .000 .000 .025 .052
PTB .342 .308 .433 .167 .563 .394 .515 .321 .130 .072 .136 .118 .283 .217 .164 .086
CLN .313 .125 .163 .115 .201 .177 .103 .115 .116 .144 .100 .027 .148 .068 .071 .065
SOF .205 .231 .174 .417 .258 .237 .103 .161 .282 .314 .379 .245 .117 .196 .080 .120
FRG .643 .654 .489 .552 .494 .583 .467 .519 .276 .419 .275 .327 .133 .024 .148 .345
TRS .100 .125 .085 .155 .107 .120 .132 .161 .093 .092 .129 .109 .050 .000 .007 .034
OPT .000 .091 .015 .060 .026 .027 .029 .000 .031 .030 .043 .018 .000 .000 .000 .000
ORB .000 .000 .235 .167 .170 .045 .278 .273 .326 .231 .404 .382
CON .286 .111 .211 .333 .138 .057 .509 .381 .308 .269 .042 .024
JAP .250 .000 .061 .068 .117 .148 .067 .148 .143 .126 .123 .167 .019 .088 .824 .840
M3U .000 .136 .067 .086 .080 .020 .188 .208 .295 .310 .246 .284 .000 .000 .111 .109
MEN .225 .227 .122 .069 .218 .257 .129 .100 .075 .059 .043 .082 .093 .111 .240 .109
MHB .190 .292 .085 .065 .157 .056 .097 .100 .070 .083 .043 .064 .120 .028 .206 .107
BUC .095 .000 .092 .097 .045 .000 .016 .000 .042 .028 .015 .027 .019 .028 .017 .032
M3L .025 .000 .050 .224 .077 .048 .167 .205 .264 .267 .167 .192 .022 .059 .106 .076
TRM .000 .000 .000 .017 .000 .000 .000 .000 .006 .000 .014 .009 .023 .029 .046 .022
ATA .068 .081 .000 .000
ATB .005 .006 .000 .000
Trait Ainu Japan NE Asia Siberia Aleutian Arctic America Polynesia
M F M F M F M F M F M F M F M F
Nh* 58 39 230 144 81 34 55 47 132 123 428 435 559 467 53 38
OMB .169 .172 .126 .107 .149 .095 .203 .238 .266 .203 .195 .177 .192 .194 .227 .186
AST .231 .136 .140 .127 .214 .189 .194 .184 .248 .082 .201 .126 .230 .162 .295 .217
PNB .186 .198 .272 .183 .190 .141 .225 .133 .222 .174 .284 .255 .169 .154 .176 .087
POS .192 .085 .302 .245 .275 .355 .331 .338 .178 .199 .157 .159 .128 .134 .358 .279
HYP .222 .233 .114 .092 .107 .128 .074 .144 .233 .213 .178 .187 .186 .199 .159 .098
PCP .274 .276 .307 .209 .281 .129 .310 .238 .188 .134 .294 .220 .220 .158 .188 .122
ICC .324 .226 .229 .197 .279 .158 .305 .231 .221 .247 .259 .258 .361 .373 .270 .207
SQS .025 .008 .023 .019 .022 .000 .038 .012 .023 .014 .027 .020 .109 .069 .014 .000
SPS .006 .000 .005 .000 .017 .000 .027 .012 .045 .012 .010 .003 .032 .006 .028 .000
MAR .071 .123 .126 .153 .151 .066 .126 .136 .300 .247 .116 .127 .263 .251 .169 .289
TYM .125 .275 .271 .301 .227 .329 .204 .219 .556 .555 .239 .366 .307 .401 .079 .138
FSP .165 .246 .170 .169 .161 .372 .146 .190 .187 .228 .185 .263 .127 .176 .441 .352
LPF .147 .189 .088 .048 .119 .063 .063 .045 .085 .086 .092 .063 .143 .117 .100 .033
CIV .038 .009 .028 .012 .044 .051 .039 .006 .051 .022 .079 .059 .091 .045 .064 .011
PTB .260 .250 .191 .114 .156 .066 .044 .036 .135 .094 .114 .099 .256 .173 .136 .056
CLN .048 .132 .052 .030 .064 .077 .093 .057 .261 .159 .147 .096 .172 .171 .023 .033
SOF .196 .208 .373 .375 .523 .513 .549 .607 .542 .674 .583 .614 .491 .563 .275 .459
FRG .158 .179 .222 .308 .320 .329 .124 .203 .310 .416 .188 .284 .285 .350 .271 .543
TRS .073 .034 .109 .070 .063 .092 .066 .055 .037 .036 .019 .009 .055 .055 .106 .163
OPT .014 .038 .063 .026 .084 .128 .114 .122 .048 .070 .015 .031 .036 .040 .115 .073
ORB .122 .163 .373 .254 .463 .282 .346 .278 .174 .132 .155 .089 .185 .098 .203 .162
CON .139 .207 .303 .307 .463 .452 .417 .611 .344 .408 .333 .320 .152 .211 .202 .353
JAP .640 .658 .310 .401 .302 .343 .268 .363 .177 .276 .197 .240 .131 .182 .089 .115
M3U .284 .257 .377 .517 .266 .317 .319 .438 .089 .115 .128 .133 .072 .091 .183 .286
MEN .223 .143 .144 .095 .176 .121 .125 .140 .077 .119 .106 .082 .113 .076 .197 .037
MHB .134 .042 .046 .028 .014 .017 .123 .074 .262 .252 .117 .093 .224 .216 .091 .056
BUC .182 .122 .057 .052 .049 .179 .000 .042 .126 .059 .094 .083 .054 .049 .048 .056
M3L .256 .204 .333 .396 .180 .229 .287 .375 .121 .157 .240 .197 .102 .136 .047 .200
TRM .068 .056 .200 .198 .283 .283 .274 .207 .407 .252 .223 .202 .078 .076 .019 .045
ATA .104 .000 .014 .038 .109 .127 .110 .018 .133 .057
ATB .000 .000 .000 .019 .091 .137 .172 .053 .122 .079
*  Harmonic mean of sample sizes over traits (rounded to integer value)

DLM accommodating side difference

The DLM proposed by the present author (Tagaya, 2019) is used here with a modification to accommodate the side difference because significant side differences are observed in the data used for analyses. This modification introduces a parameter δ for left-side dominance (i.e., right-side dominance if δ is negative) in the probability of trait expression, and assumes that the inter-side component of liability is distributed as N(δ, 1) for the left side and N(−δ, 1) for the right side, where N(m, v) denotes a normal distribution with mean m and variance v. DLM(V, M, δ) will hereafter denote a DLM with its inter-individual component of liability (IICL) distributed as N(M, V) and the left-side-dominance parameter being δ. All the components of liability are mutually independent, as in the original version of DLM. Figure 1 shows the distribution of the four types of trait expression for given values of IICL under DLM(2, −1, 0.1) calculated using the function DLM_Q_Dist in Appendix 2.

Figure 1

Distribution of four types of trait expression over the inter-individual component of liability under DLM(2, −1, 0.1). Each parenthesized value in the legend is the probability of occurrence of the type of expression obtained as the area under the curve.

Existence of complete-fit DLM for observed data

The existence of a complete-fit DLM, DLM(V, M, δ), depends on the estimate of inter-side correlation coefficient of liability, R, obtained by the tetrachoric procedure, as follows. When V is given, the values of M and δ are determined from the right- and left-side rates of trait occurrence. Noting that R = V/(V + 1), V is calculated as V = R/(1 − R). Therefore, a complete fit DLM exists if 0 ≤ R < 1, which is equivalent to 0 ≤ ϕ < 1 for the phi correlation coefficient, ϕ. The data with enough large sample size should satisfy this condition if the model is appropriate.

Hypotheses about sex difference as constraints on DLM parameters

The following hypotheses were tested under DLM(Vij, Mij, δij), where ij indicates sex j (j = 1 for male and j = 2 for female) of population i (i = 1 to 16).

H1: Vij = Vi ( equality of variance of IICL between sexes in every population)
H2: Vi = V ( homogeneity of variance of IICL across populations under H1)

The variance Vij is expected to reflect the variability of factors affecting IICL. The magnitude of such variability is assumed to be equal between sexes under H1, and to be constant across populations under H2. These seem to be biologically natural assumptions because most of the numerous factors affecting IICL should be common to sexes and populations. Statistically, these are the premise for the inter-sex and inter-population comparisons based on the liability. Let di denote the sex difference in the mean of IICL for population i. Under H1 and H2, the homogeneity of sex difference across populations and significance of the assumed constant sex difference are examined by testing the following hypotheses.

H3: di = Mi1Mi2 = d ( homogeneity of sex difference across populations)
H4: d = 0 ( the assumed constant sex difference being null)

Procedures for testing hypotheses

The MLE method was used to estimate the parameter values of DLM(Vij, Mij, δij) and obtain the test statistic for each hypothesis (i.e., constraint on Vij and Mij). Appendix 1 illustrates the procedure for these analyses. The calculations used the programs coded be the author in Microsoft Visual Basic for Applications (VBA 7.1) and the Solver of Microsoft Excel 2016. Appendix 2 provides the VBA programs with instructions for their usage in the MLE procedures including the calculation of the confidence interval (CI) of a parameter. The hypotheses were tested for each trait, and collectively over all the 31 traits. The collective test used the sums of chi-square values and corresponding degrees of freedom. This procedure is justifiable because the correlations between the traits were mostly very low. (The root mean square of Spearman’s rho coefficient over all combinations of traits based on the pooled data was 0.035 for the left-side occurrence, 0.036 for the right-side occurrence, and 0.039 for the individual-count trait occurrence.)

Comparison of sex difference with group difference

The magnitude of sex difference in the mean of liability (d) estimated under H3 was evaluated by its ratio to the group difference in the mean of liability (Dg) estimated under H3. This Dg does not depend on sex because the sex difference is assumed to be constant under H3. In this study, Dg was defined as the weighted root mean square (WRMS) of inter-group difference in the mean of liability over all pairs of groups with the harmonic mean of sample sizes being used as the weight of each pair. The mean of liability of a group was obtained as the weighted average of the means of member populations. The group, not the population, was used as the unit in order for the value of Dg to reflect the world-wide variability.

Non-parametric tests

The CMH method uses the male-to-female odds ratio (OR) to define the sex difference. Let Ri denote the male-to-female OR of trait occurrence for population i. The hypothesis of constant sex difference across populations is expressed as Ri = R, which is tested by the Breslow–Day test of homogeneity of OR. The CMH estimate of common OR gives the estimate of this assumed constant R. The null hypothesis R = 1 for testing significance of the sex difference is examined by the CMH test of the conditional independence.

The Breslow–Day test of the homogeneity of OR can be regarded as a test of applicability of the genotype model to nonmetric data. Under the genotype model, the sex difference in the trait frequency must be explained by the difference in penetrance if the genotype frequency is equal between sexes, as is usually considered. On the other hand, to justify the population comparison based on trait frequencies, the penetrance must be constant across populations. (This applies to both the side-count and individual-count penetrance values because the latter is expressed as 2pp2, using the former value, p.) Numerical simulations show that the male-to-female OR of trait occurrence is fairly stable for usual ranges of genotype frequency and penetrance if the penetrance is constant in each sex. Therefore, a substantial homogeneity across populations should be observed in the male-to-female OR if the genotype model is applicable to the trait expression, although there is a possibility that both the models exhibit sufficient goodness of fit because the liability-based homogeneity and the OR-based homogeneity of sex difference are numerically close to each other if the inter-population variability of trait frequency is not so large.

The analyses based on the CMH method used individual-count data and IBM SPSS Statistics version 25.

Results

Homogeneity of variance of liability

The constancy of variance of IICL was tested between sexes first and then across populations (Table 4). No trait exhibited statistically significant sex difference in the variance of IICL. The sum of chi-square values indicated that H1 was acceptable when tested collectively over traits (P = 0.511). Statistically significant heterogeneity of variance among populations was exhibited by only four traits (FRG at the 1% level, and BUC, AST, and FSP at the 5% level), none of which satisfied Benjamini and Hochberg’s (1995) criterion of q < 0.05 to control the false discovery rate (FDR). The sum of the chi-square values indicated that H2 was acceptable when tested collectively over traits (P = 0.217).

Table 4 Test of hypotheses about variance of liability: equality between sexes (H1) and across populations (H2)
Trait H1 (Vij = Vi) H2 | H1 (Vi = V) H1 & H2 (Vij = V)
χ2 d.f. P V χ2 d.f. P χ2 d.f. P
OMB 15.51 16 0.488 1.196 8.04 15 0.922 23.548 31 0.829
AST 21.80 16 0.150 1.220 26.09 15 0.037 * 47.885 31 0.027 *
PNB 8.37 16 0.937 1.178 23.42 15 0.076 31.794 31 0.427
POS 18.10 16 0.318 0.786 15.39 15 0.424 33.488 31 0.347
HYP 23.71 16 0.096 0.711 15.55 15 0.413 39.265 31 0.146
PCP 18.37 16 0.303 2.143 14.43 15 0.493 32.802 31 0.379
ICC 24.29 16 0.083 0.905 16.55 15 0.347 40.841 31 0.111
SQS 15.98 16 0.455 1.409 12.38 15 0.650 28.359 31 0.603
SPS 2.63 16 1.000 1.949 3.56 15 0.999 6.190 31 1.000
MAR 20.57 16 0.196 1.770 11.95 15 0.683 32.521 31 0.392
TYM 14.25 16 0.580 2.507 23.23 15 0.079 37.482 31 0.196
FSP 18.58 16 0.291 1.305 25.85 15 0.040 * 44.433 31 0.056
LPF 7.88 15 0.929 0.842 5.83 14 0.971 13.709 29 0.993
CIV 11.59 16 0.772 1.497 9.09 15 0.873 20.686 31 0.920
PTB 16.94 16 0.389 1.389 16.08 15 0.377 33.020 31 0.369
CLN 13.24 16 0.655 2.063 11.22 15 0.737 24.457 31 0.791
SOF 8.79 16 0.922 1.405 19.11 15 0.209 27.896 31 0.627
FRG 15.12 16 0.516 1.853 33.24 15 0.004 ** 48.361 31 0.024 *
TRS 11.98 16 0.745 1.880 13.64 15 0.553 25.621 31 0.739
OPT 24.61 16 0.077 2.088 15.08 15 0.446 39.685 31 0.136
ORB 10.06 14 0.758 2.488 14.85 13 0.317 24.910 27 0.579
CON 17.61 14 0.225 3.364 16.00 13 0.249 33.607 27 0.178
JAP 20.89 16 0.183 1.962 15.35 15 0.427 36.244 31 0.237
M3U 16.20 16 0.439 3.060 18.58 15 0.234 34.773 31 0.293
MEN 19.15 16 0.261 0.717 8.96 15 0.880 28.109 31 0.616
MHB 17.05 16 0.382 1.712 16.89 15 0.325 33.939 31 0.328
BUC 9.98 16 0.868 1.799 30.10 15 0.012 * 40.086 31 0.127
M3L 20.01 16 0.220 2.858 10.08 15 0.815 30.096 31 0.512
TRM 16.04 16 0.450 3.892 8.97 15 0.879 25.004 31 0.768
ATA 8.19 7 0.316 1.691 4.03 6 0.672 12.224 13 0.509
ATB 4.01 7 0.778 1.930 1.48 6 0.961 5.496 13 0.963
Total 471.52 473 0.511 465.01 442 0.217 936.53 915 0.303
*  P < 0.05,

**  P < 0.01.

Constant sex difference across populations

Table 5 shows the results for the test of homogeneity of sex difference (H3) and that of the null hypothesis of the assumed constant sex difference (H4) under the hypothesis of constant variance of IICL across populations (H1 and H2). Three traits (TYM, CLN, and MEN) exhibited a heterogeneity of sex difference that was statistically significant at the 5% level, but no trait satisfied even q < 0.5 to control the FDR. The sum of chi-square values indicated that H3 was acceptable when tested collectively over traits (P = 0.066).

Table 5 Test of hypotheses about sex difference in the mean of liability: homogeneity of difference across populations (H3) and no common difference (H4)
N of popul. Pooled frequency Homogeneity of sex difference H3 | H1 & H2 (di = d) Common sex difference (d) No common sex difference H4 | H1 & H2 & H3 (d = 0)
Male Female χ2 d.f. P d [95% CI] χ2 P
OMB 16 0.163 0.158 7.05 15 0.956 0.078 [−0.029 0.185] 1.97 0.160
AST 16 0.206 0.146 19.78 15 0.181 0.387 [0.285 0.490] 57.45 3E-14 ***
PNB 16 0.226 0.197 13.59 15 0.557 0.173 [0.081 0.265] 13.48 2E-04 ***
POS 16 0.218 0.200 14.92 15 0.457 0.028 [−0.046 0.081] 0.59 0.441
HYP 16 0.167 0.176 15.17 15 0.439 −0.023 [−0.094 0.047] 0.32 0.574
PCP 16 0.239 0.171 5.63 15 0.985 0.582 [0.416 0.751] 44.42 3E-11 ***
ICC 16 0.291 0.280 21.25 15 0.129 0.053 [−0.024 0.130] 1.55 0.214
SQS 16 0.048 0.034 11.32 15 0.729 0.269 [0.104 0.439] 10.56 0.001 **
SPS 16 0.017 0.005 12.24 15 0.661 0.892 [0.545 1.315] 29.27 6E-08 ***
MAR 16 0.158 0.159 18.27 15 0.249 −0.016 [−0.151 0.119] 0.10 0.749
TYM 16 0.251 0.343 28.29 15 0.020 * −0.707 [−0.876 −0.544] 70.15 5E-17 ***
FSP 16 0.174 0.232 23.41 15 0.076 −0.341 [−0.444 −0.239] 45.30 2E-11 ***
LPF 15 0.100 0.076 10.47 14 0.727 0.197 [0.088 0.309] 13.39 3E-04 ***
CIV 16 0.063 0.038 11.43 15 0.721 0.485 [0.323 0.655] 36.21 2E-09 ***
PTB 16 0.189 0.131 17.75 15 0.276 0.411 [0.298 0.525] 52.68 4E-13 ***
CLN 16 0.136 0.115 27.59 15 0.024 * 0.289 [0.119 0.461] 10.52 0.001 **
SOF 16 0.445 0.503 24.30 15 0.060 −0.213 [−0.306 −0.120] 20.32 7E-06 ***
FRG 16 0.256 0.343 19.32 15 0.200 −0.550 [−0.684 −0.419] 69.82 7E-17 ***
TRS 16 0.060 0.050 15.85 15 0.392 0.105 [−0.086 0.299] 1.22 0.270
OPT 16 0.037 0.040 16.92 15 0.324 −0.128 [−0.383 0.117] 0.93 0.335
ORB 14 0.236 0.137 8.36 13 0.819 0.836 [0.588 1.097] 46.34 1E-11 ***
CON 14 0.300 0.300 15.25 13 0.292 −0.046 [−0.374 0.280] 0.07 0.785
JAP 16 0.205 0.248 11.13 15 0.743 −0.369 [−0.532 −0.210] 21.88 3E-06 ***
M3U 16 0.164 0.171 15.65 15 0.406 −0.366 [−0.628 −0.111] 7.31 0.007 **
MEN 16 0.121 0.089 25.20 15 0.047 * 0.213 [0.123 0.304] 22.24 2E-06 ***
MHB 16 0.138 0.122 11.97 15 0.681 0.179 [0.022 0.338] 5.37 0.021 *
BUC 16 0.066 0.060 18.06 15 0.260 0.134 [−0.064 0.331] 1.57 0.210
M3L 16 0.185 0.192 22.29 15 0.101 −0.155 [−0.402 0.092] 1.54 0.215
TRM 16 0.149 0.135 6.69 15 0.966 0.376 [−0.025 0.791] 3.46 0.063
ATA 7 0.100 0.062 10.93 6 0.090 0.419 [0.011 0.862] 4.04 0.044 *
ATB 7 0.080 0.059 7.59 6 0.269 0.334 [−0.161 0.868] 1.76 0.184
Total 487.67 442 0.066 595.84 1E-105 ***
*  P < 0.05,

**  P < 0.01,

***  P < 0.001.

Statistically significant sex difference was exhibited by 19 traits (TYM, FRG, AST, PTB, ORB, FSP, PCP, CIV, SPS, MEN, JAP, SOF, PNB, LPF, SQS, CLN, M3U, MHB, and ATA) at the 5% level, of which the first 14 cases were significant at the 0.1% level, and the next three cases were significant at the 1% level. All these traits, apart from ATA, satisfied the Benjamini–Hochberg criterion q < 0.05 to control the FDR. The absolute magnitude of sex difference was below 1 in all traits, and the 95% CI was within the range from −1 to 1, except for two traits (SPS and ORB). This means that the difference in the mean of liability between sexes was less than the inter-side variability for all traits, although not necessarily negligible as shown later.

Relative magnitude of sex difference

Table 6 shows the ratios of magnitude of sex difference to the within-population standard deviation of liability (|d|/√(V + 1); S/W ratio) and to the WRMS of group difference (|d|/Dg; S/G ratio).

Table 6 Relative magnitude of sex difference to within-population SD (√(V + 1)) of liability and the WRMS of group difference (Dg) (see text for definition of types)
Trait V d S/W ratio |d|/√(V + 1) Dg S/G ratio |d|/Dg χ2 P Type
OMB 1.198 0.078 5.0% 0.739 10.5% 1.97 0.160 0
AST 1.226 0.387 24.4% 0.366 105.7% 57.45 3E-14 *** 2
PNB 1.181 0.173 11.2% 0.336 51.4% 13.48 2E-04 *** 1
POS 0.790 0.028 2.2% 0.563 4.9% 0.59 0.441 0
HYP 0.714 −0.023 1.9% 0.396 5.9% 0.32 0.574 0
PCP 2.143 0.582 24.6% 0.951 61.2% 44.42 3E-11 *** 2
ICC 0.909 0.053 3.9% 0.315 16.8% 1.55 0.214 0
SQS 1.415 0.269 15.5% 0.904 29.8% 10.56 0.001 ** 1
SPS 1.956 0.892 40.6% 0.695 128.4% 29.27 6E-08 *** 2
MAR 1.773 −0.016 0.8% 0.966 1.6% 0.10 0.749 0
TYM 2.516 −0.707 26.1% 1.011 70.0% 70.15 5E-17 *** 2
FSP 1.311 −0.341 20.7% 0.452 75.5% 45.30 2E-11 *** 2
LPF 0.847 0.197 15.0% 0.377 52.3% 13.39 3E-04 *** 1
CIV 1.502 0.485 26.9% 0.701 69.2% 36.21 2E-09 *** 2
PTB 1.394 0.411 23.9% 0.866 47.4% 52.68 4E-13 *** 2
CLN 2.070 0.289 12.6% 0.988 29.2% 10.52 0.001 ** 1
SOF 1.410 −0.213 12.3% 1.093 19.5% 20.32 7E-06 *** 1
FRG 1.859 −0.550 26.1% 0.865 63.6% 69.82 7E-17 *** 2
TRS 1.885 0.105 4.9% 1.126 9.3% 1.22 0.270 0
OPT 2.093 −0.128 5.5% 0.573 22.3% 0.93 0.335 0
ORB 2.491 0.836 31.1% 1.267 66.0% 46.34 1E-11 *** 2
CON 3.406 −0.046 1.3% 1.792 2.5% 0.07 0.785 0
JAP 1.965 −0.369 16.7% 1.739 21.2% 21.88 3E-06 *** 1
M3U 3.072 −0.366 11.3% 2.235 16.4% 7.31 0.007 ** 1
MEN 0.724 0.213 17.2% 0.322 66.1% 22.24 2E-06 *** 1
MHB 1.718 0.179 9.0% 1.043 17.2% 5.37 0.021 * 1
BUC 1.803 0.134 6.5% 0.486 27.6% 1.57 0.210 0
M3L 2.872 −0.155 5.1% 1.429 10.8% 1.54 0.215 0
TRM 3.904 0.376 9.3% 3.774 10.0% 3.46 0.063 0
ATA 1.714 0.419 21.1% 0.498 84.0% 4.04 0.044 * 2
ATB 1.954 0.334 15.2% 1.768 18.9% 1.76 0.184 0
*  P < 0.05,

**  P < 0.01,

***  P < 0.001.

The traits can be classified into three types based on the statistical significance and S/W ratio of the assumed constant sex difference. Type 0 includes 12 traits (MAR, CON, HYP, POS, ICC, TRS, OMB, M3L, OPT, BUC, TRM, and ATB) with no statistical significance and very low S/W ratios (1–9%, excepting 15% for ATB). Type 1 includes 9 traits (M3U, MHB, SOF, JAP, CLN, SQS, PNB, LPF, and MEN) with statistical significance at 5% but rather low S/W ratios (9–17%). Type 2 includes 10 traits (SPS, ORB, CIV, TYM, FRG, PCP, AST, PTB, ATA, and FSP) with statistically significant sex difference (P < 0.001, excepting P = 0.044 for ATA) and high S/W ratios (over 20%). The S/G ratio was 1.6–27.6% for type 0, 16.4–66.1% for type 1, and 47.4–128.4% for type 2.

Non-parametric test of sex difference

Table 7 shows the results of the CMH tests of sex difference. The result of the Breslow–Day test of homogeneity of OR indicates that the deviation from the homogeneity of sex difference (Ri = R) was statistically significant in only five traits (TYM at the 1% level, and CLN, OPT, MEN, and M3L at the 5% level), none of which satisfied q < 0.05 to control the FDR. When tested collectively over traits, however, the deviation was statistically significant (P = 0.0013). The Pearson’s correlation coefficient of log-transformed probability with that of the DLM–MLE method was 0.878.

Table 7 Non-parametric test of sex difference
Trait Homogeneity of sex difference (Ri = R)1 Commmon sex difference (common R)2 No common sex difference (R = 1)3
χ2 d.f. P log(OR) [95% CI] χ2 P
OMB 8.96 15 0.880 0.132 [0.003 0.262] 3.88 0.049 *
AST 21.23 15 0.130 0.452 [0.332 0.573] 54.33 2E-13 ***
PNB 12.82 15 0.616 0.182 [0.071 0.293] 10.16 0.001 **
POS 17.33 15 0.299 0.007 [−0.105 0.119] 0.01 0.927
HYP 14.15 15 0.514 −0.024 [−0.137 0.089] 0.15 0.697
PCP 8.35 15 0.909 0.416 [0.285 0.546] 38.79 5E-10 ***
ICC 24.43 15 0.058 0.069 [−0.040 0.177] 1.47 0.225
SQS 14.50 15 0.488 0.311 [0.102 0.521] 8.19 0.004 **
SPS 17.01 13 0.199 1.265 [0.809 1.720] 32.23 1E-08 ***
MAR 15.09 15 0.445 −0.039 [−0.162 0.085] 0.34 0.562
TYM 30.98 15 0.009 ** −0.423 [−0.527 −0.318] 62.72 2E-15 ***
FSP 21.26 15 0.129 −0.387 [−0.501 −0.274] 44.42 3E-11 ***
LPF 11.32 14 0.661 0.292 [0.120 0.465] 10.77 0.001 **
CIV 11.37 13 0.579 0.575 [0.382 0.769] 33.94 6E-09 ***
PTB 19.15 15 0.207 0.448 [0.324 0.571] 50.32 1E-12 ***
CLN 27.10 15 0.028 * 0.238 [0.098 0.377] 11.06 9E-04 ***
SOF 24.29 15 0.060 −0.206 [−0.311 −0.102] 14.73 1E-04 ***
FRG 20.07 15 0.169 −0.421 [−0.531 −0.311] 55.90 8E-14 ***
TRS 17.95 15 0.265 0.124 [−0.063 0.311] 1.57 0.211
OPT 22.65 13 0.046 * −0.095 [−0.333 0.143] 0.52 0.473
ORB 12.46 13 0.491 0.544 [0.378 0.711] 41.08 1E-10 ***
CON 15.07 14 0.373 −0.036 [−0.194 0.122] 0.17 0.682
JAP 10.64 15 0.778 −0.301 [−0.434 −0.168] 19.53 1E-05 ***
M3U 16.46 14 0.286 −0.209 [−0.355 −0.063] 7.73 0.005 **
MEN 26.32 15 0.035 * 0.372 [0.224 0.520] 24.09 9E-07 ***
MHB 12.61 15 0.633 0.177 [0.026 0.327] 5.11 0.024 *
BUC 22.56 15 0.094 0.183 [−0.015 0.381] 3.13 0.077
M3L 25.97 15 0.038 * −0.091 [−0.236 0.055] 1.41 0.234
TRM 5.60 12 0.935 0.191 [0.008 0.375] 4.00 0.046 *
ATA 9.14 6 0.166 0.440 [0.021 0.860] 3.86 0.049 *
ATB 9.15 4 0.057 0.275 [−0.186 0.736] 1.13 0.288
Total 525.98 431 0.001 ** 546.73 1E-95 ***
1  Breslow–Day test.

2  Cochran-Mantel-Haenszel estimate of common OR.

3  Cochran-Mantel-Haenszel test of conditional independence.

*  P < 0.05,

**  P < 0.01,

***  P < 0.001.

The null hypothesis for the assumed constant sex difference (R = 1) was rejected in 21 traits (TYM, FRG, AST, PTB, FSP, ORB, PCP, CIV, SPS, MEN, JAP, SOF, CLN, LPF PNB, SQS, M3U, MHB, TRM, OMB, and ATA) at the 5% level, of which the first 13 cases were significant at the 0.1% level, and the next four cases were significant at the 1% level. The first 18 of them satisfied q < 0.05 to control the FDR. These are the same traits that satisfied the same criterion in the analysis of the DLM–MLE method. The Pearson’s correlation coefficient of log-transformed probability between the methods was very high (r = 0.993).

Discussion

Homogeneity of variance of liability

The hypothesis of equality of variance of liability between sexes proved to be acceptable. This seems natural considering that the variability of genetic and environmental factors affecting the liability should be identical between sexes. The hypothesis of homogeneity of variance of liability across populations was substantially acceptable. This suggests that most of the factors affecting the liability are common to all populations. These results justify the comparison of the means of liability between sexes and populations in the subsequent analyses.

Homogeneity of sex difference in the mean of liability

The results based on the world-wide 16 populations indicate that the sex difference in the mean of liability is basically constant across populations for each trait. This is in accordance with the assumption of the multifactorial threshold model that the factors affect the liability additively even though the magnitude of sex difference may be much greater than these factors. Perizonius (1979) raised a question about whether the same traits show sex differences in various populations. This study has given an affirmative answer to this question, and, moreover, showed that both the direction and magnitude of the differences should be assumed to be constant.

The result here disagrees with the conclusion of Berry (1975), who emphasized inconsistency in the direction and magnitude of sex difference among populations. However, the result she presented appears to be generally in accordance with the hypothesis of constant sex difference. In her result, the direction of difference was consistent in five of six traits where multiple samples exhibited statistically significant sex differences. The only exception was the highest nuchal line, where one of three samples exhibited difference at the 5% level in the opposite direction to that of other two. Therefore, the conclusion did not faithfully reflect the result of her analysis. This was probably because she preferred the hypothesis of random sex difference, which could explain the failure of Berry and Berry (1967) to detect significant sex differences when they pooled the same samples.

Magnitude of sex difference

In the 12 traits classified as type 0, the constant sex difference was statistically not significant at the 5% level, and the nine traits classified as type 1 exhibited statistically significant but fairly small sex differences. The magnitude of sex difference for these 21 traits seems to be enough low to regard sex as one of the numerous factors affecting the liability, hence these traits may be safely used for population comparisons without distinguishing sex. For the remaining 10 traits classified as type 2 because of their statistically significant higher magnitudes of sex difference, it is not justifiable to regard sex as one of the numerous factors contributing to normal distribution of liability. A limited extent of uneven sex composition of samples may not so much affect the results of population comparisons because the assumed constant sex difference was less than the average group difference except for two traits. However, since the estimates of average group difference were based on the world-wide grouping, the S/G ratios obtained here should be considered as minimum values. In a comparison of closer populations, the effect of uneven sex composition can become innegligible. Mouri (1988), who compared the cranial nonmetric data of Jomon populations in West Japan using the mean character difference as the distance measure, reported the distances between sexes as comparable to those between populations.

A population comparison without distinguishing sex is justifiable if it uses the traits with a sufficiently low magnitude of sex difference in comparison with differences between populations. Since sex difference of liability proved to be basically constant across populations, the statistical significance and relative magnitude of the assumed constant sex difference in Table 6 would provide references for selection of such traits. Since the S/G ratio depends on the populations included in the comparison, the traits most appropriate for a comparison may vary depending on the level of comparison and the combination of populations.

Comparison of results with non-parametric tests

The results of the tests of sex difference were similar between the two methods despite the difference in their definitions of sex difference. This is not surprising because the numerical effect caused by the difference in the definition of sex difference tends to be small when the trait frequency varies within a limited range. It is notable, however, that the homogeneity of sex difference in the mean of liability exhibited a higher goodness of fit than the homogeneity of sex difference in the form of male-to-female OR of trait occurrence. This indicates that the multifactorial threshold model is more appropriate than the genotype model for explaining the expression of nonmetric traits. The genotype model seems to be the only logically consistent model that justifies the use of individual counts for recording the expression of bilateral nonmetric traits. The ASUDA system in dental anthropology has adopted this model for justification of the individual counts (Turner et al., 1991).

Ossenberg (1981) adopted the threshold model with a single liability and two thresholds proposed by Falconer (1960) to argue for the use of side counts. Although the strangeness of the two thresholds invited criticisms (McGrath et al., 1984; Hallgrímsson et al., 2005), and the conventional count method has been predominantly used for some reason, the use of side counts she advocated is a mathematical requirement under the DLM (Tagaya, 2019). Thus, the results in this study indicate that the world-wide data of nonmetric traits Ossenberg collected over her lifetime support the use of side counts she proposed early in her academic life.

Acknowledgments

My deep gratitude is due to the late Dr. Nancy Suzanne Ossenberg who made her valuable data open to our academic community.

Appendix 1 Examples of testing hypotheses by the MLE method

Table S1 illustrates the MLE procedure for testing the hypotheses about the distribution of liability of DLM accommodating side difference. Only two populations are used here, but the number of populations will be expressed as k for extension to a general case. The programs required for calculations are provided in Appendix 2.

Table S1 Illustration of DLM–MLE procedures using the data for PCP (without Anscombe-like modification)
Hypothesis (constraint on DLM parameters) Popul. Sex MLE estimates Test of hypothesis
Parameters Expected counts Total Conditional
V Mij δij a b c d Total χ2 d.f.* P χ2 d.f.* P
(A) DLM with no constraint Europe M 1.807 −2.009 0.097 229 24 17 29 299 0.00 (Complete fit)
F 2.835 −3.585 0.143 181 10 6 17 214
i: population Japan M 2.289 −1.257 0.039 221 31 27 82 361
j: sex F 2.137 −1.911 0.099 150 17 12 29 208
(B) H1 (Vij = Vi) Europe M 2.100 −2.245 0.106 230.5 21.8 15.0 31.7 299 2.22 2 0.329 2.22 2 0.329
F 2.100 −2.813 0.117 179.7 12.1 8.0 14.3 214 (= k) (= k)
 Equal variance between sexes Japan M 2.239 −1.235 0.039 220.5 31.5 27.5 81.4 361
F 2.239 −1.981 0.102 150.4 16.5 11.5 29.6 208
(C) H2 (Vi = V) and H1 Europe M 2.185 −2.313 0.109 230.9 21.2 14.4 32.4 299 2.35 3 0.504 0.12 1 0.726
F 2.185 −2.900 0.121 179.9 11.8 7.7 14.6 214 (= 2k − 1) (= k − 1)
 Constant variance across populations Japan M 2.185 −1.212 0.038 220.0 32.2 28.1 80.7 361
F 2.185 −1.944 0.100 150.2 16.7 11.8 29.3 208
(D) H3 (di = d) and H1 and H2 Europe M 2.186 −2.283 0.108 229.7 21.5 14.6 33.2 299 2.48 4 0.648 0.14 1 0.713
F 2.186 −2.954 0.122 181.0 11.5 7.4 14.0 214 (= 3k − 2) (= k − 1)
 Constant sex difference across populations (d = 0.657) Japan M 2.186 −1.233 0.038 221.3 32.0 28.0 79.7 361
F 2.186 −1.904 0.100 149.0 17.0 11.9 30.1 208
(E) H4 (d = 0) and H1 and H2 and H3 Europe M 2.207 −2.558 0.114 239.3 19.3 12.9 27.5 299 14.49 5 0.013 12.01 1 0.001
F 2.207 −2.558 0.112 171.3 13.8 9.2 19.7 214 (= 3k − 1) (= 1)
 No sex difference Japan M 2.207 −1.477 0.039 235.0 30.1 26.2 69.7 361
F 2.207 −1.477 0.094 135.2 19.0 13.7 40.1 208
*  The formula in parenthesis gives the degrees of freedom of the chi-square value when k populations are included in the analysis.

(A) To accomplish the complete fit of the DLM to the observed data, three parameters (V, M, and δ) are required for each sex of each population. Therefore, 6k (= 12) parameters (i.e. Vij, Mij, and δij) are used in total.

(B) Hypothesis H1 assumes that the variance of liability is equal between sexes (i.e. Vij = Vi), hence the number of independent parameters (i.e. Vi, Mij, and δij) becomes 5k, and the decrease of the number of independent parameters, 6k − 5k = k (= 2) is the degrees of freedom (d.f.) of the chi-square statistic.

(C) Hypothesis H2 assumes that the common-to-sex variance of liability is constant across populations (i.e. Vi = V). Under H1 and H2 (i.e. Vij = V), the number of independent parameters (i.e. V, Mij, and δij) becomes 4k + 1, hence d.f. of the chisquare statistic is 6k − (4k + 1) = 2k − 1 (= 3). To test H2 under H1, the increments of the chi-square value and d.f. from those under H1 are used as the chi-square statistic and its d.f.

(D) Hypothesis H3 assumes that the difference in the mean of liability between sexes is constant across populations (i.e. di = d). Under H1 and H2 and H3 (i.e. Vij = V and di = d), the number of independent parameters (i.e. V, d, Mi, and δij) becomes 3k + 2, hence d.f. of the chi-square statistic is 6k − (3k + 2) = 3k − 2 (= 4). To test H3 under H1 and H2, the increments of the chi-square value and d.f. from those under H1 and H2 are used as the chi-square statistic and its d.f.

(E) Hypothesis H4 assumes that the difference in the mean of liability between sexes assumed to be constant across populations is equal to 0 (i.e. d = 0). Under H1 and H2 and H3 and H4 (i.e. Vij = V and di = 0), the number of independent parameters (i.e. V, Mi, and δij) becomes 3k − 1, hence d.f. of the chi-square statistic is 6k − (3k + 1) = 3k − 1 (= 5). To test H4 under H1 and H2 and H3, the increments of chi-square value and d.f. from those under H1 and H2 and H3 are used as the chi-square statistic and its d.f.

Appendix 2 VBA programs and usage instructions

The VBA functions provided here return the results in an array. To use an array function, select a range of cells, enter any formula including the function, e.g. “= DLM_Q (A3, B3, C3)*(D3 + 0.75)” in the first cell of the range, and press the Enter key while pressing the Shift and Control keys. The example above calculates the expected count quartet for sample size plus 3/4 if the values of V, M, and δ are stored in A3, B3, and C3, respectively, and the sample size is stored in D3. In the following descriptions, the DLM parameters V, M, and δ are represented by V, Mean, and Dev, respectively, in the VBA program codes.

1. Function DLM_Q (V, M, δ)

This function calculates the probability quartet under DLM (V, M, δ) and returns it in an array. The quartet probability densities for IICL value x calculated by DLM_Q_Dist(x, V, M, δ) are integrated from M − 6√V to M + 6√V using Simpson’s quadrature rule.

2. Function DLM_Q_Dist (x, V, M, δ)

This calculates the quartet of probability densities for IICL value x under DLM (V, M, δ). For a given value of x, the conditional probabilities of trait occurrence on the left and right sides are calculated as PL = h(x + δ) and PR = h(xδ) using the cumulative standard normal distribution function h. Since the left- and right-side liabilities are mutually independent when x is fixed, the quartet probability densities for x are obtained as {QLQRf(x), PLQRf(x), QLPRf(x), PLPRf(x)}, where QL = 1 − PL, QR = 1 − PR, and f(x) is the probability density of x distributed as N(M, V).

3. Function Chisq_DLM_Q (X, V, M, δ, ID)

This function calculates the chi-square statistic to test the goodness of fit of DLM(V, M, δ) to observed count data stored in the range X, and returns the chi-square value and the given or estimated values of M and δ in an array. Both or one of the M and δ can be left blank to use the estimates based on the data stored in X and V. The parameter values for complete fit DLM are obtained using the Solver. Find the value of V that minimizes the function value of Chisq_DLM_Q with both M and δ left blank. The value becomes equal to zero when the complete fit is achieved.

4. Iterative application of Solver for MLE solution

To obtain the MLE solution, it is often necessary to iteratively apply the Solver to each parameter (as the ‘changing variable’) in turn until the chi-square statistic converges to its minimum value. This can be automatized using a program created by the VBA Macro Recorder, with some modification of codes and inclusion of the Solver in the references. The use of the initial estimates obtained by the blank-estimation method above enables rapid conversion.

Each border of 95% CI of a parameter is obtained as the value that increases the minimum chi-square value by 3.84146 (the chi-square value with one degree of freedom for P = 0.05). This requires iterative application of Solver to find values of other parameters minimizing the chi-square value. Repeat adjustment of the border value and minimization of the chisquare value (by changing other parameters) alternately until the statistic converges to the expected value.

‘ Programs created by A.Tagaya for analyses based on the DLM

‘ DLM_Q, DLM_Q_Dist, and Chisq_DLM_Q return the results in an array.

Function DLM_Q(V, Mean, Dev, Optional k = 10) As Variant

‘Calculates the quartet of probabilities {P00,P01,P10,P11} for DLM(V,Mean,Dev)

‘Uses DLM_Q_Dist and Simpson

‘Dimension of F0 to F3 must be equal to N (=k*Nsd*2) or greater.

Dim F0(960), F1(960), F2(960), F3(960), F(4)

SD = Sqr(V)

Nsd = 6 ‘Set the interval for quadrature [Mean-Nsd*SD to

Mean+Nsd*SD]

N = k * Nsd * 2 ‘total number of sections

dx = SD / k ‘width of increment

x0 = Mean - Nsd * SD ‘lower border of the interval for quadrature

For i = 0 To N ‘Calculation for function values of x over the

quadrature interval

X = x0 + i * dx

FF = DLM_Q_Dist(X, V, Mean, Dev) ‘Calculate prob. densities for x

F0(i) = FF(0, 0) ‘Set i-th prob. density in {F0(i), F1(i), F2(i), F2(i)}

F1(i) = FF(1, 0)

F2(i) = FF(2, 0)

F3(i) = FF(3, 0)

Next i

F(0) = Simpson(F0, N, dx) ‘Calculate prob. quartet {F(0), F(1), F(2), F(3)}

F(1) = Simpson(F1, N, dx)

F(2) = Simpson(F2, N, dx)

F(3) = Simpson(F3, N, dx)

TF = F(0) + F(1) + F(2) + F(3) ‘adjustment to make total prob=1

For i = 0 To 3

FF(i, 0) = F(i) / TF ‘Set the results for use as a vertical array

FF(0, i) = F(i) / TF ‘Set the results for use as a horizontal array

Next i

DLM_Q = FF

End Function

Function DLM_Q_Dist(X, V, Mean, Dev) As Variant

‘Calculates quartet probab. densities for a given IICL value X under DLM(V,

Mean, Dev).

‘Returns the values in an array.

Dim F(4), FF(4, 4) As Variant

Px = Application.WorksheetFunction.Norm_Dist(X, Mean, Sqr(V), 0) ‘Prob. density

of X

‘For given X, the total liability is distributed with var=1 and mean=X+Dev (or

X-Dev).

‘Hence the probability for the total liability to exceed 0 (=threshold) is

given as follows.

PL = Application.WorksheetFunction.Norm_Dist(X+Dev, 0, 1, 1)

PR = Application.WorksheetFunction.Norm_Dist(X-Dev, 0, 1, 1)

QL = 1 - PL ‘Conditional probability of negative expression on the

left side

QR = 1 - PR ‘Conditional probability of negative expression on the

right side

F(0) = QL * QR * Px ‘ {F(0), F(1), F(2), F(3)} is the quartet of prob.

densities

F(1) = PL * QR * Px

F(2) = QL * PR * Px

F(3) = PL * PR * Px

For i = 0 To 3

FF(i, 0) = F(i) ‘Set the results for use as a vertical array

FF(0, i) = F(i) ‘Set the results for use as a horizontal array

Next i

DLM_Q_Dist = FF

End Function

Function Simpson(F, N, dx)

‘Integration by Simpson’s quadrature rule.

‘N must be an even number.

‘F is an array containing N+1 function values in F(0) to F(N).

If Int(N / 2) * 2 < N Then Simpson = “N is not even”: Exit Function

S = F(0) - F(N)

For i = 1 To N - 1 Step 2

S = S + 4 * F(i) + 2 * F(i + 1)

Next i

Simpson = S * dx / 3

End Function

Function Chisq_DLM_Q(X, V, Optional Mean = ““, Optional Dev = ““, Optional ID = 1)

As Variant

‘Chi-square statistic based on the likelihood of DLM(V, Mean, Dev) for count

data X

‘Mean and/or Dev are estimated from X and V if one or both of them are left

blank.

‘Set ID=0 to avoid modification of data X by adding dN={1/8, 1/4, 1/4, 1/8}.

Dim XX(4), F(4), Chi2(3, 3)

dN = Array(0.125, 0.25, 0.25, 0.125)

T = 0

For i = 0 To 3

A = X.Cells(i + 1) + dN(i) * ID ‘add {1/8, 1/4, 1/4, 1/8} to avoid zero

frequency

T = T + A

XX(i) = A

Next i

PL = (XX(1) + XX(3)) / T ‘side-count frequency for left side

PR = (XX(2) + XX(3)) / T ‘side-count frequency for right side

If Mean = ““ Or dev = ““ Then

MeanL = Application.WorksheetFunction.Norm_S_Inv(PL) * Sqr(V + 1)

MeanR = Application.WorksheetFunction.Norm_S_Inv(PR) * Sqr(V + 1)

Mean0 = (MeanL + MeanR) / 2

Dev0 = MeanL - Mean0

End If

If Mean = ““ Then Mean = Mean0 ‘If Mean is blank, the estimate is used

If Dev = ““ Then Dev = Dev0 ‘If Dev is blank, the estimate is used

FF = DLM_Q(V, Mean, Dev) ‘Store the quartet probabilities in FF (two-

dimensional)

For i = 0 To 3

F(i) = T * FF(i, 0) ‘F(i) is the count expected for sample size T

(=N+0.75*ID)

Next i

LnLR = 0 ‘LnLR: log likelihood ratio

For i = 0 To 3

If XX(i) * F(i) > 0 Then LnLR = LnLR + XX(i) * Log(F(i) / XX(i))

Next i

Chi2(0,0)= -2 * LnLR

Chi2(0,1)= Mean: Chi2(1,0)=Mean

Chi2(0,2)= Dev: Chi2(2,0)=Dev

Chisq_DLM_Q = Chi2

End Function

References
 
© 2020 The Anthropological Society of Nippon
feedback
Top