Consider the detection of abnormality when both continuous and categorical variables are observed. The squared Mahalanobis distance is sometimes used for this purpose, with the categorical variables replaced by their dummy variables. Assuming that the continuous variables are conditionally normally distributed given the categorical ones, the exact distribution of the squared Mahalanobis distance is derived. When only one dichotomous variable exists, it is a mixture of two shifted χ² distributions. When k dichotomous variables are involved and their effects on the mean vector of the continuous variables are additive, it is a mixture of k shifted χ² distributions.
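As an illustration of the quantity discussed above, the following minimal sketch computes the squared Mahalanobis distance for data with one dichotomous variable coded as a 0/1 dummy and two continuous variables that are conditionally normal given the dummy. All names and parameter values (`p`, `mu0`, `mu1`, `sigma`, the sample size) are assumptions chosen for illustration and are not taken from the paper. The mean and covariance are estimated from the simulated sample, which may differ from the setting in which the exact distribution is derived.

```python
import numpy as np


def squared_mahalanobis(z, mean, cov):
    """Squared Mahalanobis distance of each row of z from `mean` under `cov`."""
    diff = np.atleast_2d(z) - mean
    # Solve cov @ y = diff.T instead of inverting cov explicitly.
    return np.einsum("ij,ji->i", diff, np.linalg.solve(cov, diff.T))


def simulate_normal_condition(n, p, mu0, mu1, sigma, rng):
    """Draw rows (d, x): d ~ Bernoulli(p), x | d ~ N(mu_d, sigma)."""
    d = rng.binomial(1, p, size=n)
    means = np.where(d[:, None] == 1, mu1, mu0)
    noise = rng.multivariate_normal(np.zeros(len(mu0)), sigma, size=n)
    return np.column_stack([d, means + noise])


# Illustrative (assumed) parameters for the normal condition.
rng = np.random.default_rng(0)
n = 5000
mu0 = np.array([0.0, 0.0])   # mean of continuous variables when d = 0
mu1 = np.array([1.0, -0.5])  # mean of continuous variables when d = 1
sigma = np.array([[1.0, 0.3], [0.3, 1.0]])

z = simulate_normal_condition(n, 0.2, mu0, mu1, sigma, rng)
mean = z.mean(axis=0)
cov = np.cov(z, rowvar=False)  # covariance of (dummy, continuous) jointly
d2 = squared_mahalanobis(z, mean, cov)

# Distances can be inspected separately for each value of the dummy,
# reflecting the two components of the mixture.
d2_when_0 = d2[z[:, 0] == 0]
d2_when_1 = d2[z[:, 0] == 1]
```

Splitting `d2` by the value of the dummy gives an empirical view of the two mixture components; a histogram of each group can then be compared with the theoretical result.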
Using these results, a detection method based on the Mahalanobis distance (the modified Mahalanobis distance method) is constructed as a hypothesis testing procedure with an exact significance level. For the case of a single dichotomous variable, some basic properties of the conditional rejection probabilities under the normal condition are shown.
Further, a comparison is made with the conditional detection method, which is based on the conditional distribution of the continuous variables given the observed dichotomous variable. Numerical evaluation of the power functions of the two methods shows that the modified Mahalanobis distance method performs better than the conditional detection method for a wide range of parameter values, especially when the category of the dichotomous variable that is less probable under the normal condition becomes more probable under the abnormal condition.