This paper proposes an integrated model of household category purchase incidence, brand choice, and purchase quantity choice using the Gaussian copula. In contrast to existing models, we assume a general form for the dependence parameter matrix of the Gaussian copula. The proposed approach allows us to decompose the joint probability of the purchase outcomes into the conditional probability of one decision given the others. The price elasticities derived from these conditional probabilities fully reflect the underlying dependence among the decisions. The conditional probabilities are also used to predict future responses by mimicking the sequence of purchase decisions. The proposed model is applied to scanner panel data for the dishwashing soap category. We find a very strong positive dependence between incidence and brand choice, whereas the dependences between incidence and quantity choice and between brand choice and quantity choice are negative. The main sources of the overall behavioral response to price are the incidence and brand choice decisions; the quantity decision is hardly influenced by price changes once the category purchase and brand choice decisions have been made.
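The decomposition of a joint purchase probability into a conditional probability can be sketched with a bivariate Gaussian copula. This is a minimal illustration, not the paper's full trivariate model: the function names, the example marginal probabilities, and the dependence value `rho` are our assumptions.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def joint_prob(p_incidence, p_brand, rho):
    """Joint probability of two binary purchase outcomes under a
    Gaussian copula: C(u, v; rho) = Phi2(Phi^-1(u), Phi^-1(v); rho)."""
    z = [norm.ppf(p_incidence), norm.ppf(p_brand)]
    cov = [[1.0, rho], [rho, 1.0]]
    return multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf(z)

def cond_prob(p_incidence, p_brand, rho):
    """Probability of the brand outcome conditional on incidence."""
    return joint_prob(p_incidence, p_brand, rho) / p_incidence

# With rho = 0 the copula reduces to independence; a positive rho
# (as found for incidence and brand choice) raises the conditional.
print(cond_prob(0.3, 0.5, 0.0))   # approximately the marginal 0.5
print(cond_prob(0.3, 0.5, 0.8))   # larger than 0.5
```

Price elasticities of these conditional probabilities, rather than of the marginals alone, are what let the dependence among the decisions show up in the measured response.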
Using store purchase record (panel) data, we propose a model that analyzes individual store preference by incorporating store location and each customer's residential area. The proposed model has three characteristics. First, because the model estimates parameters for each customer, it can capture customer heterogeneity; it incorporates not only individual heterogeneity but also a time trend. Second, the model enables trade area analysis through its prior structure: by assuming a hierarchical structure, it can estimate the store visit probabilities of each geo-demographic segment. Third, by incorporating a spatial lag model into the prior structure of the geographic parameters, it can compensate for missing data by borrowing information from neighboring regions. The cost of market survey research is therefore likely to be reduced by applying the proposed spatial lag model.
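How a spatial lag prior lets a region without survey data borrow from its neighbors can be sketched as follows. The four-region chain, the weight matrix, and the values of `rho`, `sigma2`, and the observed parameters are our illustrative assumptions, not the paper's data.

```python
import numpy as np

# Hypothetical 4-region chain; region 3 (index 3) has no survey data.
W = np.array([[0.0, 1.0, 0.0, 0.0],      # row-normalized contiguity weights
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
rho, sigma2 = 0.5, 1.0                   # assumed spatial dependence and noise scale

# Spatial lag prior: theta = rho * W @ theta + eps  =>  theta = (I - rho W)^-1 eps,
# so the implied prior covariance is sigma2 * [(I - rho W)' (I - rho W)]^-1.
A = np.eye(4) - rho * W
Sigma = sigma2 * np.linalg.inv(A.T @ A)

# Condition the missing region's parameter on the observed regions
# (standard Gaussian conditioning on the prior).
obs, mis = [0, 1, 2], [3]
theta_obs = np.array([1.0, 0.8, 0.6])    # estimated geographic parameters where observed
S_mo = Sigma[np.ix_(mis, obs)]
S_oo = Sigma[np.ix_(obs, obs)]
theta_mis = S_mo @ np.linalg.solve(S_oo, theta_obs)
print(theta_mis)                          # imputed prior mean for region 3
```

The imputed value is pulled toward the parameter of the adjacent region, which is the sense in which neighboring regions supply information for areas without survey observations.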
Optimal powers of the Gaussian and Jeffreys priors are obtained so that they minimize the asymptotic mean square error of the linear predictor and the sum of the asymptotic mean square errors of the associated parameter estimators. Conditions under which the summarized mean square errors based on powers of the priors are smaller than those of maximum likelihood are given. In the case of a scalar canonical parameter in the exponential family, a matching prior for the Jeffreys power prior is found, for which the Wald confidence interval has second-order accurate coverage. The results are numerically illustrated using the categorical distribution and logistic regression.
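The idea of a power of the Jeffreys prior can be made concrete in the binomial case, where the Jeffreys prior is Beta(1/2, 1/2) and its power c yields a Beta(1 - c/2, 1 - c/2) kernel. The sketch below compares the resulting posterior-mode estimator with the MLE by simulation; the chosen power c = 1 and the simulation setting are ours for illustration, not the paper's optimal power.

```python
import numpy as np

rng = np.random.default_rng(0)

def power_jeffreys_mode(y, n, c):
    """Posterior mode for a binomial success probability under the
    Jeffreys prior raised to power c; c = 0 recovers the MLE y/n."""
    # Jeffreys^c is proportional to a Beta(1 - c/2, 1 - c/2) kernel,
    # so the posterior is Beta(y + 1 - c/2, n - y + 1 - c/2) with
    # mode (y - c/2) / (n - c).
    return (y - c / 2) / (n - c)

# Monte Carlo comparison of mean square errors at an assumed true p.
p_true, n = 0.2, 20
y = rng.binomial(n, p_true, size=20000)
mse_mle = np.mean((y / n - p_true) ** 2)
mse_pow = np.mean((power_jeffreys_mode(y, n, 1.0) - p_true) ** 2)
```

Whether the power-prior estimator beats the MLE depends on the true parameter and the chosen power, which is exactly what the paper's conditions characterize.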
Ensemble learning, which combines multiple base learners to improve statistical prediction accuracy, is frequently used in statistical science and data mining. However, because of their “black box” nature, ensemble learning models are difficult to interpret. A recently proposed rule ensemble method known as RuleFit presents each base learner as a production rule and also generates a measure of its influence on the response variable. The RuleFit method for a binary response applies a squared-error ramp loss function, and the base learners are weighted by shrinkage regression using the lasso method. Thus, RuleFit is not built on a logistic regression model. Moreover, highly correlated pairs of base learners may be excessively pruned by the lasso method. In this study, we address the excess pruning problem by constructing RuleFit within a logistic regression framework, weighting the base learners by the elastic net. The effectiveness of our proposed RuleFit model is illustrated through a real data set. In small-scale simulations, this method demonstrated higher predictive performance than the original RuleFit model.
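The proposed construction can be sketched with off-the-shelf tools: rule-like binary features are generated from a tree ensemble and then weighted by an elastic-net logistic regression. This is our simplified stand-in (leaf-indicator features in place of RuleFit's production rules, on synthetic data), not the authors' implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression

# Synthetic binary-response data standing in for the real data set.
X, y = make_classification(n_samples=300, n_features=5, random_state=0)

# Step 1: grow shallow trees; each leaf defines a conjunctive rule.
forest = RandomForestClassifier(n_estimators=20, max_depth=3,
                                random_state=0).fit(X, y)
leaves = forest.apply(X)                       # leaf index per tree per sample

# Step 2: encode leaf membership as binary rule features.
rules = OneHotEncoder().fit_transform(leaves)  # sparse 0/1 rule matrix

# Step 3: weight the rules by elastic-net logistic regression, so the
# model stays within a logistic framework while the l1 part still
# prunes rules and the l2 part keeps correlated rules from being
# pruned en bloc.
clf = LogisticRegression(penalty="elasticnet", solver="saga",
                         l1_ratio=0.5, C=0.1, max_iter=5000).fit(rules, y)
```

The elastic net's ridge component is what distinguishes this from the original lasso weighting: correlated rules share weight instead of one being dropped outright.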
Adachi (2013) showed that the EM algorithm for maximum likelihood (ML) factor analysis always gives a proper solution if positive unique variances are used as the initial values. This means that EM has the advantage of always avoiding improper solutions. However, it also creates a potential problem of not being able to detect an improper solution. To remedy this disadvantage, we monitored the convergence process of the EM algorithm. We found that (i) convergence was much slower for improper solutions than for proper solutions; (ii) the reciprocal of the unique variance estimate responsible for an improper solution showed a nearly perfect linear relationship with the iteration number; (iii) for improper solutions, the maximum absolute change (MAC) of the unique variances raised to the power of -0.5 showed a nearly perfect linear relationship with the iteration number; (iv) for proper solutions, the MAC of the unique variances raised to the power of -(0.5)6 showed a nearly perfect linear relationship with the iteration number; (v) when the solution was proper, the ratio of two adjacent MACs of the unique variances approached a constant value, whereas when the solution was improper, no such constant was found.
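The diagnostic in (ii) can be sketched as a simple monitoring routine: track the smallest unique variance across EM iterations and check whether its reciprocal grows linearly with the iteration number. The synthetic iterate sequences and the correlation-based linearity check below are our illustration, not the authors' exact procedure.

```python
import numpy as np

def linearity_of_reciprocal(psi_seq):
    """Correlation between the iteration number and 1/psi, where psi_seq
    is the smallest unique variance at each EM iteration. A value near 1
    signals that 1/psi is growing linearly, i.e. psi is heading to zero
    and the run is converging to an improper solution."""
    inv = 1.0 / np.asarray(psi_seq, dtype=float)
    t = np.arange(len(inv))
    return np.corrcoef(t, inv)[0, 1]

# Proper solution: the unique variance converges geometrically to 0.4.
proper = 0.4 + 0.3 * 0.8 ** np.arange(50)

# Improper solution: psi -> 0 with 1/psi exactly linear in the iteration,
# mimicking finding (ii).
improper = 1.0 / (1.0 + 0.5 * np.arange(50))
```

Running `linearity_of_reciprocal` on the two sequences separates them cleanly: the improper sequence gives a correlation of essentially 1, the proper one a clearly smaller value, which is how monitoring can flag an improper solution that the EM iterates themselves never reveal.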