Anthropological Science
Online ISSN : 1348-8570
Print ISSN : 0918-7960
ISSN-L : 0918-7960
Original Articles
Tetrachoric correlation of bilateral nonmetric traits: a defect in the conventional procedure and a proposal for two alternative estimation methods
AKIRA TAGAYA

2019 Volume 127 Issue 1 Pages 39-45

Abstract

The individual-count method of bilateral nonmetric traits has been widely used despite its apparent defects in both theory and practice. Logically, its use means adopting the false concept of ‘two thresholds’ based on the single-liability model. This conceptual defect can create actual problems, including that of conventional ‘tetrachoric correlation.’ The correlation coefficient calculated by formally applying the tetrachoric procedure to the individual-count frequencies is mathematically meaningless because there exists no true liability and threshold that can explain such data. Moreover, it considerably underestimates the correlation if it is used as the estimate of the correlation between the individual-specific components of liability because it neglects the contribution of the inter-side component in the variance of total liability. Two statistical methods are proposed to estimate the correlation coefficient between inter-individual components of liability and its confidence interval. Some selected data from the database published by Ossenberg on the Internet were used to illustrate the utility of the new methods and to examine the problem of the conventional method. The method of estimation of the correlation between the inter-individual components of liability based on the combination of two dual-liability models provided, as a by-product, substantial support for the standard threshold model based on data. Because the conventional ‘tetrachoric correlations’ proved to seriously underestimate the correlations, the results of almost all studies using Mahalanobis distances based on nonmetric traits so far published may require re-evaluation. It is also argued that a fundamental problem exists in the use of the individual-count method itself. Adopting an incorrect method for maintaining comparability is a vicious cycle. It is necessary to emphasize improving the reliability of future studies based on true statistics rather than keeping the comparability between less reliable results based on the false concept of threshold.

Introduction

The tetrachoric estimate (Pearson, 1900) of Pearson’s product-moment correlation coefficient is widely used for calculation of the Mahalanobis distance based on nonmetric traits with a normal distribution of underlying liabilities proposed by Blangero (Konigsberg, 1990; Blangero and Williams-Blangero, 1991). Because it is calculated from the 2 × 2 frequencies of two dichotomous traits, the individual-count frequencies have been used for bilateral traits (Sutter and Verano, 2007; Schillaci et al., 2009; Irish, 2010; Nikita et al., 2012). However, there seem to be problems in the conceptual, biological, and statistical aspects of such a procedure.

Conceptually, the use of individual-count frequencies does not conform to the threshold model because there exists no real threshold for the presence/absence of individual-counts. As a threshold statistic, the individual-count frequency of a bilateral trait is based on a single-liability model (SLM) with two thresholds between the three types of expression—no occurrence, asymmetric occurrence, and symmetric occurrence—and for some reason uses only the lower threshold. However, the SLM cannot be a true threshold model when adopted for a bilateral trait. Construction of a threshold model for the occurrence of a bilateral trait must be founded on the fact that the occurrence of the trait on each side is a threshold character by itself. It is therefore apparent that there exist no thresholds between the three types of occurrence. As illustrated by McGrath et al. (1984), the probability of asymmetric occurrence is highest around the ‘fuzzy threshold’—a term coined by Hallgrímsson et al. (2005)—and gradually decreases with the distance from this point.

Although every threshold should be somewhat fuzzy when microscopically observed, the threshold model is based on the assumption that the width of such a fuzzy range is negligibly small. The fuzzy ranges between the three types of bilateral expression are far from negligible. Because their widths are, with mathematical inevitability, of the same level as the threshold interval, the threshold interval must be also neglected when these fuzzy ranges are neglected, and the ‘fuzzy threshold’ must be used as the single threshold of the SLM. In other words, the premise of the SLM rejects the existence of asymmetric occurrence when used for a bilateral trait. Thus, the SLM with two thresholds contains a conceptual inconsistency with self-contradiction when used for a bilateral trait; the threshold characteristic of the trait occurrence on each side does not allow the SLM to have its two thresholds.

As explained above, the individual-count data have no true liability and threshold to explain them, hence the value obtained by formally adopting the tetrachoric method for such data is not a real tetrachoric correlation. It has no entity to give any meaning to it. It is noteworthy that Konigsberg (1990) and Konigsberg et al. (1993) bypassed this problem by randomly selecting the side to be used for each bilateral trait within each cranium.

Biologically, the substantial correlations between liabilities of traits should have been underestimated because of the existence of the within-individual fluctuation represented by asymmetric expressions of the traits. The same phenomenon should also exist in non-lateral traits, although it is not detectable without twin data. In this sense, bilateral traits should provide important information about this phenomenon. The magnitude of within-individual fluctuation of non-lateral traits may be estimated to some extent from those of bilateral traits with characteristics similar to those of the non-lateral trait.

Statistically, since the deviation of the mean threshold value (or the mean liability if the threshold is used as the origin of the liability scale) due to within-individual fluctuation calculated from the data of a population approaches 0 with increasing sample size, the correlation coefficients used for Mahalanobis distances should be based on the inter-individual components rather than the total liabilities. Moreover, the conventional procedure for tetrachoric estimation does not conform to the genuine threshold model and neglects much information. The estimation must be based on the dual-liability model (DLM) where the liability of each trait consists of the inter-side and inter-individual components, and 3 × 3 frequency data (or 4 × 4 frequency data if side difference is considered) must be used to determine the best fit correlation coefficient and other parameters.

The estimate of the correlation by the conventional method will be hereinafter written as ‘tetrachoric correlation’ to distinguish it from the true tetrachoric estimate of the correlation. Although it is unclear what liabilities are presumed in calculation of the ‘tetrachoric correlation’, it would be reasonable to assume that they were the liabilities specific to the individual and free from the inter-side fluctuation. If so, they can be nothing but the inter-individual components if we accept the standard threshold model.

To correctly apply the tetrachoric method to the individual-specific liability, we must know the frequency of individuals with a liability over the threshold, but there is no way for us to know whether or not an individual’s liability is over the threshold value, as McGrath et al. (1984) described. The individual-count frequency uses the lower one of the false ‘two thresholds’ instead of the ‘fuzzy threshold.’ It is strange that the upper ‘threshold’ is never used for this purpose. This tradition may have originated in scholars’ desire to see gene-like characteristics in nonmetric traits.

As noted above, the fluctuation of occurrence is biologically an important and useful characteristic of bilateral traits. The conventional procedure changes the nature of the data by ignoring this characteristic to forcibly apply a method that is inappropriate for the data. This is not a true solution. Rather, the method must be changed to suit the nature of the data. Considering the proportion of asymmetric occurrences in nonmetric traits, it is apparent that the factor ignored by the conventional procedure must seriously distort the results. These problems may have been overlooked simply because no method has been known to numerically demonstrate them. This paper attempts to reveal the problems of the conventional method by using two new statistical methods for estimating the correlations between the inter-individual components of liability appropriate for bilateral nonmetric traits.

Materials and Methods

Dataset

The nonmetric trait data of Arctic populations that are found in the dataset published on the Internet by Ossenberg (2013a, b) were used to illustrate the proposed methods and to examine their utilities, as well as to show the problems with the conventional method. The data of the traits listed in Table 1 were used for the analysis. All the individuals were pooled without distinguishing sex and treated as a sample representing the Arctic people.

Table 1 Traits used for examination of correlations between traits
Code Trait
OMB occipitomastoid ossicle
AST asterionic ossicle
PNB parietal notch bone
POS posterior condylar canal absent
LPF foramen in lateral pterygoid plate
CIV pterygospinous bridge complete (foramen of Civinini)
CON infraorbital suture variant
FRG frontal groove(s)
SOF supraorbital foramen

The correlations were analyzed for all the pairs of the first four traits in Table 1, and for three selected pairs of the remaining five traits. Table 2a provides the complete data for respective pairs. The 3 × 3 frequencies in Table 2b were calculated from those in Table 2a.

Table 2a Frequencies for 4 × 4 expressions of the pairs of traits examined (The terms ‘left’ and ‘right’ indicate occurrence only on the left side and occurrence only on the right side)
Pair of traits    Frequency
            T1:   none  none  none  none  left  left  left  left  right right right right both  both  both  both
T1    T2    T2:   none  left  right both  none  left  right both  none  left  right both  none  left  right both  Total
OMB   AST         932   68    89    73    120   22    19    19    85    18    23    13    82    11    26    29    1629
OMB   PNB         727   139   141   147   90    30    22    37    88    11    22    17    76    20    16    35    1618
OMB   POS         822   139   113   62    133   25    13    5     97    21    10    7     109   19    10    5     1590
AST   PNB         898   170   157   154   70    24    17    27    82    20    29    39    66    19    18    42    1832
AST   POS         971   164   128   68    102   17    13    4     118   31    12    6     100   17    10    11    1772
PNB   POS         818   144   100   58    167   34    19    14    171   23    24    6     187   37    26    12    1840
LPF   CIV         945   38    39    25    56    7     3     6     60    2     7     4     9     4     4     3     1212
LPF   CON         421   42    44    193   33    3     3     8     37    1     3     9     9     1     1     1     809
FRG   SOF         357   154   135   440   21    11    8     82    13    8     12    69    29    20    21    184   1564
Table 2b Frequencies for 3 × 3 expressions of the pairs of traits examined (The two rows above the frequencies indicate the numbers of occurrences of traits T1 and T2)
Pair of traits    Frequency
            T1:   0     0     0     1     1     1     2     2     2
T1    T2    T2:   0     1     2     0     1     2     0     1     2     Total
OMB   AST         932   157   73    205   82    32    82    37    29    1629
OMB   PNB         727   280   147   178   85    54    76    36    35    1618
OMB   POS         822   252   62    230   69    12    109   29    5     1590
AST   PNB         898   327   154   152   90    66    66    37    42    1832
AST   POS         971   292   68    220   73    10    100   27    11    1772
PNB   POS         818   244   58    338   100   20    187   63    12    1840
LPF   CIV         945   77    25    116   19    10    9     8     3     1212
LPF   CON         421   86    193   70    10    17    9     2     1     809
FRG   SOF         357   289   440   34    39    151   29    41    184   1564

Direct method: estimation based on dual-liability model using 4 × 4 or 3 × 3 frequencies

The DLM with a pair of liabilities each consisting of normally distributed inter-individual and inter-side components was assumed for each of the two traits with its parameters being variable between the traits. The inter-individual component was assumed to be identical for both sides. Each inter-side component was assumed to be independent of other components and to have unit variance. Based on these assumptions, the parameters were determined so that the frequencies estimated from the model yield best fit to observed frequencies. The goodness of fit was evaluated using the chi-square value.

For the numerical solutions, two functions were written in Visual Basic for Applications (VBA) in Microsoft Excel. The first function generates the four bilateral frequencies (none, left, right, and both) from three parameters: sample size (N), probability of trait occurrence (P), and population variance of the inter-individual component (V). The second function calculates the 4 × 4 frequencies from six parameters: sample size (N), probabilities of trait occurrence (P1 and P2), population variances of the inter-individual components (V1 and V2) of the respective traits, and the correlation coefficient between the inter-individual components of liability (R). The population variance of the inter-side component was set to 1 in both functions. The calculations were based on the univariate and bivariate normal distribution formulae. The correctness of the functions was confirmed with a set of normally distributed random data for 10000 individuals artificially generated in a Microsoft Excel worksheet.

The sample size (N) and probabilities of trait occurrence (P1 and P2) were determined from the data, and the other parameters (V1, V2, and R) were estimated by the minimum chi-square method using the individuals without missing data for both traits. V1 and V2 were estimated independently using the first function and were then used to estimate R with the second function. The 95% confidence interval (CI) of a parameter (R or V) was calculated using the chi-square statistic corresponding to the probability for the best estimate to be obtained by chance for the given population value of the parameter. The solution was obtained assuming the other parameters of the model to be constant. Microsoft Excel Solver was used to find the values of these parameters that generate frequencies giving the minimum chi-square value. The estimates based on 3 × 3 frequencies are shown in the Results section. The estimates based on the 4 × 4 frequencies were substantially the same as these.
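The first function described above can be sketched in Python as follows. This is only an illustration under the stated model assumptions (a normally distributed inter-individual component with variance V, independent inter-side components with unit variance, no side difference); it is not the author's VBA code. The name `bilateral_frequencies`, the integration grid, and the use of one-dimensional numerical integration in place of a bivariate normal formula are all choices made for this sketch.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def bilateral_frequencies(n, p, v, span=8.0, step=0.01):
    """Expected (none, left, right, both) counts under the dual-liability model.

    n: sample size; p: per-side probability of trait occurrence;
    v: variance of the inter-individual component (inter-side variance is 1).
    """
    s = math.sqrt(v)
    # Threshold on the total-liability scale (variance v + 1) giving side rate p.
    t = STD.inv_cdf(1.0 - p) * math.sqrt(v + 1.0)
    both = 0.0
    steps = int(round(2.0 * span / step))
    for i in range(steps + 1):
        u = -span + i * step  # standardized inter-individual component
        weight = step * (0.5 if i in (0, steps) else 1.0)  # trapezoidal rule
        q = STD.cdf(s * u - t)  # P(trait present on one side | u)
        both += weight * STD.pdf(u) * q * q  # the two sides are independent given u
    one_side = p - both  # left-only equals right-only when no side difference exists
    none = 1.0 - 2.0 * one_side - both
    return n * none, n * one_side, n * one_side, n * both


# Hypothetical usage with the rounded OMB values from Table 4.
none, left, right, both = bilateral_frequencies(1646, 0.188, 1.19 ** 2)
print(round(none, 1), round(left, 1), round(right, 1), round(both, 1))
```

With the rounded inputs for OMB (N = 1646, P = 0.188, SD = 1.19), the output should land close to the expected frequencies listed for that trait in Table 4, although small differences are to be expected because the inputs are rounded.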

An example of estimation of correlation coefficient using 3 × 3 frequencies

Table 3 illustrates the estimation of the correlation between inter-individual components using the pair of OMB and AST. The chi-square value is the sum, over the nine cells of the 3 × 3 table, of the squared difference between the observed and estimated frequencies divided by the estimated frequency. The value of R was estimated to be 0.443 because this produces the smallest chi-square value (5.11). The next row shows that the 3 × 3 frequencies estimated from the conventional ‘tetrachoric correlation’ (R = 0.330) are significantly different from those actually observed (χ2 = 9.77, df = 4, P = 0.045). The two rows at the bottom of Table 3 show that 3 × 3 frequencies equal to those expected for the best estimate (R = 0.443) can arise by chance with a probability of 5% when the true value of R is 0.277 or 0.597; hence the interval between 0.277 and 0.597 is expected to include the true value of R with a probability of 0.95 when its best estimate is 0.443.

Table 3 The 3 × 3 frequencies observed and estimated for different values of Pearson’s correlation coefficient R between inter-individual components of liability of OMB and AST (The SD estimates of the inter-individual component used for the calculation were 1.20 and 1.32, respectively)
                           Trait expression (number present in the individual)                   Fit index
                      OMB: 0      0      0      1      1      1      2      2      2
                      AST: 0      1      2      0      1      2      0      1      2      Total  χ2 (df = 4)  P
Observed                   932    157    73     205    82     32     82     37     29     1629
Best fit, R = 0.443        927.9  167.9  66.2   210.6  69.5   38.8   80.5   38.5   29.0   1629   5.11         0.276
Conventional, R = 0.330    910.7  176.6  73.9   219.0  65.8   35.8   88.5   35.2   23.5   1629   9.77         0.045
95% CI lower, R = 0.277    903.6  180.1  77.5   222.4  63.9   34.3   92.2   33.6   21.4   1629   9.49 1)      0.050
95% CI upper, R = 0.597    949.0  157.9  54.3   201.3  76.9   42.4   67.9   42.9   36.4   1629   9.49 1)      0.050
1)  Based on the difference from the frequencies corresponding to ‘best fit’.
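The minimum chi-square search for R illustrated in Table 3 can be sketched as follows. This is an illustration only, not the author's Excel Solver procedure: the function names (`expected_3x3`, `chi_square`, `fit_r`), the fixed integration grid, and the golden-section search are assumptions of this sketch. The model itself follows the description in the text: each trait has a normally distributed inter-individual component, the two inter-individual components are correlated with coefficient R, and the inter-side components are independent with unit variance.

```python
import math
from statistics import NormalDist

STD = NormalDist()
NODES = [-6.0 + 0.2 * i for i in range(61)]   # integration grid (sketch choice)
WEIGHTS = [STD.pdf(x) * 0.2 for x in NODES]   # standard normal weights


def threshold(p, sd):
    """Threshold on the total-liability scale (variance sd**2 + 1) for side rate p."""
    return STD.inv_cdf(1.0 - p) * math.sqrt(sd * sd + 1.0)


def count_probs(sd, t, u):
    """P(trait on 0, 1, 2 sides | standardized inter-individual component u)."""
    q = STD.cdf(sd * u - t)
    return (1.0 - q) ** 2, 2.0 * q * (1.0 - q), q * q


def expected_3x3(n, p1, p2, sd1, sd2, r):
    """Expected 3 x 3 frequencies for two bilateral traits under the model."""
    t1, t2 = threshold(p1, sd1), threshold(p2, sd2)
    c = math.sqrt(1.0 - r * r)
    cells = [[0.0] * 3 for _ in range(3)]
    for u1, w1 in zip(NODES, WEIGHTS):
        a = count_probs(sd1, t1, u1)
        inner = [0.0, 0.0, 0.0]
        for z, w2 in zip(NODES, WEIGHTS):
            # u2 = r*u1 + c*z has unit variance and correlation r with u1.
            b = count_probs(sd2, t2, r * u1 + c * z)
            for j in range(3):
                inner[j] += w2 * b[j]
        for i in range(3):
            for j in range(3):
                cells[i][j] += w1 * a[i] * inner[j]
    total = sum(map(sum, cells))  # close to 1; renormalize away grid error
    return [[n * x / total for x in row] for row in cells]


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e
               for orow, erow in zip(observed, expected)
               for o, e in zip(orow, erow))


def fit_r(observed, n, p1, p2, sd1, sd2, lo=-0.9, hi=0.9, iters=20):
    """Golden-section search for the R minimizing chi-square (assumes one minimum)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    f = lambda r: chi_square(observed, expected_3x3(n, p1, p2, sd1, sd2, r))
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = f(x1), f(x2)
    for _ in range(iters):
        if f1 < f2:   # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = f(x1)
        else:         # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = f(x2)
    return (lo + hi) / 2.0


# OMB x AST frequencies from Table 2b, with the rounded P and SD values of Table 6.
observed = [[932, 157, 73], [205, 82, 32], [82, 37, 29]]
r_hat = fit_r(observed, 1629, 0.189, 0.167, 1.20, 1.32)
print(round(r_hat, 2))
```

Because the inputs are rounded and the numerical integration differs from the original implementation, the estimate should be near, but not necessarily identical to, the best-fit value reported in Table 3.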

Side-frequency method: corrected tetrachoric correlation based on side frequencies

We can use another method to estimate the correlation coefficient between inter-individual components. Let $S_1$ and $S_2$ be the standard deviations of the inter-individual components of liability of two traits, and $R$ be the correlation coefficient between these components. Because the inter-side component is assumed to be independent of all other components, the covariance of the total liabilities is $S_1 S_2 R$, and the variances of the total liabilities are $S_1^2 + 1$ and $S_2^2 + 1$. Therefore, the correlation coefficient between the total liabilities is obtained by multiplying $R$ by $S_1 S_2 / \sqrt{(S_1^2 + 1)(S_2^2 + 1)}$. This relationship can be used to estimate the value of $R$ from the tetrachoric estimate of the correlation coefficient between the total liabilities estimated from the side frequencies.
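The correction described above amounts to dividing the side-based tetrachoric correlation by an attenuation factor. A minimal sketch follows; the function names `attenuation` and `corrected_r` are hypothetical, and the inputs are the rounded OMB and AST values given in Tables 5 and 6.

```python
import math


def attenuation(sd1, sd2):
    """Factor by which R is shrunk in the correlation between total liabilities.

    sd1, sd2: SDs of the inter-individual components (inter-side variance is 1).
    """
    return sd1 * sd2 / math.sqrt((sd1 ** 2 + 1.0) * (sd2 ** 2 + 1.0))


def corrected_r(r_total, sd1, sd2):
    """Correlation between inter-individual components recovered from r_total."""
    return r_total / attenuation(sd1, sd2)


# OMB and AST: SD estimates 1.20 and 1.32, mean side-based tetrachoric R = 0.28.
print(round(attenuation(1.20, 1.32), 3))        # 0.612
print(round(corrected_r(0.28, 1.20, 1.32), 2))  # 0.46
```

With these rounded inputs the corrected value comes out at about 0.46, slightly above the 0.45 listed in Table 6, which was presumably computed from unrounded estimates.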

Results

Bilateral threshold statistics of each trait

Table 4 shows the frequencies of the four types of bilateral expression and estimates of the standard deviation of the inter-individual component of liability of the nine traits. Although the two traits, OMB and POS, exhibited statistically significant side differences, no correction was made because the side difference does not seem to greatly affect the liability specific to individuals and the correlation between traits. The chi-square values evaluate the goodness of fit of the threshold model, but lower values only indicate that there exists no significant side difference in those traits; a complete fit is always assured if a side difference does not exist. The 95% CI of S for each trait was calculated by using the chi-square statistic (df = 2), evaluating the difference between the observed triad frequencies and those expected for a given value of S.

Table 4 Frequencies, trait rates (P), and estimates of the standard deviation (SD) of the inter-individual component of liability (The chi-square values for goodness of fit indicate the significance of side differences. The SD estimates differ a little from those in Table 6 because of missing values of paired traits.)
              Observed frequency                                      Expected from P and SD
Trait  N      none   left   right  both   P      SD    [95% CI]     none    left   right  both   χ2
OMB    1646   1176   181    141    148    0.188  1.19  [1.03 1.37]  1175.2  161.8  161.8  147.2  4.96*
AST    1860   1396   142    174    148    0.165  1.30  [1.13 1.49]  1395.5  158.5  158.5  147.5  3.24
PNB    1941   1179   247    234    281    0.269  1.17  [1.03 1.31]  1178.9  240.6  240.6  280.9  0.35
POS    1988   1451   261    179    97     0.159  0.70  [0.56 0.84]  1449.3  221.7  221.7  95.3   15.22***
LPF    1222   1057   72     73     20     0.076  0.77  [0.53 1.03]  1057.0  72.5   72.5   20.0   0.01
CIV    2003   1788   81     77     57     0.068  1.51  [1.25 1.81]  1788.0  79.0   79.0   57.0   0.10
CON    1208   755    68     62     323    0.321  3.63  [3.06 4.31]  754.9   65.1   65.1   322.9  0.28
FRG    1576   1094   122    103    257    0.234  2.22  [1.93 2.55]  1093.7  112.8  112.8  256.7  1.60
SOF    2101   604    261    250    986    0.591  1.55  [1.40 1.72]  604.0   255.5  255.5  986.0  0.24
*  P < 0.05; ***  P < 0.001 (df = 1).

True and conventional tetrachoric estimates

Table 5 shows the tetrachoric correlations between total liabilities based on side frequencies and the conventional ‘tetrachoric correlations’ formally calculated from individual-count frequencies. The results are shown for all the six pairs of the first four traits and the selected three pairs for the remaining five traits. The tetrachoric correlations by side frequencies exhibit some variability among the four combinations. Each of the conventional ‘tetrachoric correlations’ fell within the range of variability of the tetrachoric correlations by side frequencies, but all of them were larger than the weighted means of the tetrachoric correlations by sides.

Table 5 Tetrachoric correlation between total liabilities of the pair of traits based on four combinations of side-count frequencies compared with the conventional ‘tetrachoric correlations’ formally calculated from individual-count frequencies (Li and Ri indicate the left and right sides of the trait i of the pair)
Pair                   Tetrachoric correlation between total liabilities                       Conventional estimate
                       L1–L2         R1–R2         L1–R2         R1–L2         Mean
T1    T2    ρ 1)       n      R      n      R      n      R      n      R      n 2)     R 3)   n      R
OMB   AST   0.20 ***   1745   0.26   1760   0.34   1686   0.28   1712   0.22   1725.3   0.28   1629   0.33
OMB   PNB   0.08 **    1736   0.21   1758   0.09   1697   0.15   1714   0.04   1725.9   0.12   1618   0.13
OMB   POS   −0.02      1697   −0.05  1711   −0.07  1685   −0.15  1707   0.04   1699.9   −0.06  1590   −0.04
AST   PNB   0.15 ***   1926   0.23   1925   0.28   1878   0.20   1869   0.21   1899.1   0.23   1832   0.25
AST   POS   0.00       1858   0.00   1853   −0.03  1847   −0.02  1843   0.07   1850.2   0.01   1772   0.01
PNB   POS   0.00       1925   0.05   1924   −0.02  1916   −0.01  1915   −0.05  1920.0   −0.01  1840   0.00
LPF   CIV   0.15 ***   1366   0.37   1356   0.31   1352   0.30   1346   0.23   1355.0   0.30   1212   0.33
LPF   CON   0.09 *     955    −0.12  941    −0.13  944    −0.06  926    −0.18  941.4    −0.12  809    −0.18
FRG   SOF   0.20 ***   1668   0.35   1650   0.39   1664   0.35   1640   0.33   1655.4   0.36   1564   0.38
1)  Spearman’s coefficient of correlation based on individual presence/absence.

2)  harmonic mean.

3)  weighted mean.

*  P < 0.05;

**  P < 0.01;

***  P < 0.001.
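For readers who wish to reproduce the conventional value, the formal tetrachoric calculation on individual-count frequencies can be sketched as below. The sketch is an assumption-laden illustration rather than the procedure used in any of the cited studies: `upper_orthant` and `tetrachoric` are hypothetical names, the bivariate normal probability is obtained by simple numerical integration, and R is found by bisection. The 2 × 2 table for OMB and AST is derived from Table 2b by counting a trait as present when it occurs on at least one side.

```python
import math
from statistics import NormalDist

STD = NormalDist()


def upper_orthant(h, k, r, step=0.01, span=8.0):
    """P(X > h, Y > k) for a standard bivariate normal with correlation r."""
    c = math.sqrt(1.0 - r * r)
    total = 0.0
    x = h + step / 2.0  # midpoint rule on [h, span]
    while x < span:
        # Given X = x, Y is normal with mean r*x and standard deviation c.
        total += STD.pdf(x) * (1.0 - STD.cdf((k - r * x) / c)) * step
        x += step
    return total


def tetrachoric(n00, n01, n10, n11, tol=1e-4):
    """Tetrachoric correlation of a 2 x 2 table (nij: trait 1 = i, trait 2 = j)."""
    n = n00 + n01 + n10 + n11
    h = STD.inv_cdf(1.0 - (n10 + n11) / n)  # threshold of trait 1
    k = STD.inv_cdf(1.0 - (n01 + n11) / n)  # threshold of trait 2
    target = n11 / n
    lo, hi = -0.99, 0.99
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if upper_orthant(h, k, mid) < target:  # the probability increases with r
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Individual-count 2 x 2 table for OMB x AST derived from Table 2b:
# absent/absent = 932, OMB absent/AST present = 157 + 73,
# OMB present/AST absent = 205 + 82, both present = 82 + 32 + 37 + 29.
r_conv = tetrachoric(932, 230, 287, 180)
print(round(r_conv, 2))
```

The result should agree closely with the conventional estimate for this pair listed in Table 5.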

Table 6 compares the results of the three methods for estimating the correlation between inter-individual components of liabilities for nine pairs of traits. The conventional method seriously underestimates the correlation coefficients to be used for calculation of the Mahalanobis distances when the correlations are substantial. The estimates by the side-frequency method (Rc) were almost identical to those obtained by the direct method using the 3 × 3 frequencies. The Rc based on the conventional ‘tetrachoric correlation’ was not included in the table because it is not only unjustifiable in theory but also unworkable in practice. For example, if conventional ‘tetrachoric correlation’ were used, the value of the Rc between OMB and AST should be 0.54, which apparently overestimates the correlation.

Table 6 Estimates of the correlation coefficient between inter-individual components of liability compared among three methods (The chi-square statistics for goodness of fit and 95% CI were calculated using the 3 × 3 frequencies in Table 2b; χ2(4) denotes the chi-square value with df = 4)
Pair                                        Tetrachoric estimate                           Direct method
                                            Conventional     Side-frequency method
T1    T2    n     P1     P2     SD1   SD2   R      χ2(4)     n 1)     R      Rc 2)  χ2(4)  R      χ2(4)  [95% CI]
OMB   AST   1629  0.189  0.167  1.20  1.32  0.33   9.77 *    1725.3   0.28   0.45   5.17   0.44   5.11   [0.28 0.60]
OMB   PNB   1618  0.189  0.270  1.20  1.17  0.13   3.33      1725.9   0.12   0.21   1.54   0.21   1.54   [0.04 0.38]
OMB   POS   1590  0.188  0.160  1.19  0.72  −0.04  1.43      1699.9   −0.06  −0.13  0.96   −0.10  0.83   [−0.33 0.14]
AST   PNB   1832  0.163  0.267  1.31  1.16  0.25   9.18 *    1899.1   0.23   0.38   2.83   0.38   2.83   [0.22 0.52]
AST   POS   1772  0.163  0.161  1.28  0.71  0.01   5.17      1850.2   0.01   0.02   5.09   0.04   5.01   [−0.18 0.26]
PNB   POS   1840  0.267  0.160  1.15  0.70  0.00   1.18      1920.0   −0.01  −0.02  1.28   0.00   1.18   [−0.20 0.31]
LPF   CIV   1212  0.076  0.074  0.76  1.48  0.33   12.76 **  1355.0   0.30   0.60   4.72   0.61   4.71   [0.30 0.88]
LPF   CON   809   0.075  0.321  0.71  3.20  −0.18  2.06      941.4    −0.12  −0.22  1.49   −0.29  1.11   [−0.57 0.06]
FRG   SOF   1564  0.234  0.613  2.21  1.58  0.38   11.03 *   1655.4   0.36   0.46   5.37   0.48   5.18   [0.35 0.59]
1)  Harmonic mean of four combinations.

2)  Obtained by dividing R by $\mathrm{SD}_1 \mathrm{SD}_2 / \sqrt{(\mathrm{SD}_1^2 + 1)(\mathrm{SD}_2^2 + 1)}$.

*  P < 0.05;

**  P < 0.01 (one-tail probability).

Discussion

Appropriateness of the dual-liability model

The results showed that the respective sets of 3 × 3 frequencies could be explained by the combination of the dual-liability model for a pair of bilateral traits with intercorrelated inter-individual components and independent inter-side components of liability. This can be regarded as substantial support for the standard threshold model for occurrences of bilateral nonmetric traits. The threshold model has long been used to explain the occurrences of nonmetric traits, but its appropriateness has seldom been tested by data. There exists no way to test it using the frequencies of a single bilateral trait because the model leaves only one degree of freedom, which tests the side difference; the remaining degrees of freedom are consumed by determining the model parameters. In the present model, four degrees of freedom are left for substantially testing the goodness of fit of the threshold model based on the 3 × 3 frequencies without consuming a degree of freedom to test the side difference.

Unreliability of the conventional ‘tetrachoric correlation’

The theoretical examination and the results of analyses indicate that the conventional ‘tetrachoric correlation’ is not only theoretically undependable but also considerably underestimates the correlations between the inter-individual components of liability of bilateral nonmetric traits, without exception when the correlation is substantial. Therefore, the results of studies based on Mahalanobis distances using the conventional ‘tetrachoric correlations’ have to be re-evaluated if they use the correlation matrix. It seems likely that almost all the studies using Mahalanobis distances so far reported employed this incorrect method. In particular, if the traits were selected for lower intercorrelation, the superficial correlation may have been lowered by smaller values of variances of inter-individual components. In the dataset of Ossenberg (2013a, b), however, most of the Spearman correlation coefficients were not statistically significant, hence the substantial effect of the use of the wrong method will be rather limited. The pairs of traits with significant correlations presented in this article were the exceptions in Ossenberg’s database.

Utility of mean tetrachoric correlation based on side frequencies

The values of Rc obtained from the weighted mean of the tetrachoric estimates of the four combinations of sides by correcting with the estimates of variances of inter-individual components were substantially identical to the correlation coefficients estimated directly from the 3 × 3 and 4 × 4 frequencies using the threshold model for a pair of bilateral nonmetric traits. This is natural because both the estimates conform to the threshold model. The Rc will be a better estimate when missing data significantly decrease the sample size for the 4 × 4 (and accordingly 3 × 3) frequencies.

The fairly large difference between uncorrected R and Rc indicates that the random selection of the side to be used for counting within each individual adopted by Konigsberg (1990) and Konigsberg et al. (1993) might not have solved the problem. These earlier approaches bypassed the conceptual problem included in the individual-count method, but might nevertheless have underestimated the correlation, probably more than the conventional method does. A benefit exists, however, with their conceptual correctness; the values of tetrachoric correlation they obtained can be corrected using estimates of the variance of inter-individual components of liability.

Fundamental problem in vicious cycle

The fundamental problem seems to be the use of the individual-count method for bilateral traits itself. As noted earlier in this article, the rate of occurrence of bilateral traits obtained by the individual-count method is not a true threshold statistic. The conceptual problem of the individual-count method was already pointed out by Ossenberg (1981), and it should be apparent to anyone that the individual-count method underestimates the frequency of unpaired materials. What has made researchers keep using this method?

In the early stage of anthropological studies of nonmetric traits, both the side-count and individual-count methods were used, and discussions on sampling procedures and statistical treatments existed (Green et al., 1979; Korey, 1980; Ossenberg, 1981; McGrath et al., 1984). It appears that comparability with preceding reports became the reason for selecting the individual-count method, and its theoretical premise was neglected as the number of reports increased. It is ironic that both McGrath et al. (1984) and Hallgrímsson et al. (2005), who refuted the use of ‘two thresholds’ for bilateral traits, justified the use of the individual-count method. This may be because Ossenberg (1981) used the ‘two thresholds’ in her argument against the individual-count method.

Acceptance of the illogical adoption of SLM and the individual-count method to bilateral nonmetric traits can be traced back to Falconer (1960). In his textbook of genetics, Falconer cited the variable number of murine lumbar vertebrae, where the number varied among 5.0, 5.5, and 6.0, and two thresholds were assumed between them. In reality, however, the phenomenon was the bilateral sacralization of the sixth lumbar vertebra. Thus, Falconer neglected the conceptual inconsistency in his usage of SLM, but the plausible interpretation of the phenomenon based on the formally calculated ‘two thresholds’ made it difficult for readers of the textbook to doubt its academic rigor. He maintained the same usage of SLM in his second edition, published in 1981. The results of the present study and confusion in the research history both testify that the conceptual consistency must be respected more seriously than usual when the effect of its violation cannot be directly observed.

To adopt the wrong method for comparability is a vicious cycle. It will be necessary to place importance on improving the reliability of future studies rather than keeping the comparability between less reliable results based on the false concept of threshold. The only reasonable solution may be to report the frequencies of all the four combinations of expressions of bilateral traits.

Acknowledgments

My deep gratitude is due to the late Dr. Nancy S. Ossenberg for allowing our academic community to use her valuable database by publishing it on the Internet. This study would not have been possible without it. I am also grateful to the reviewer who gave me valuable suggestions for improving the writing and including important references.

© 2019 The Anthropological Society of Nippon