Ouyou toukeigaku

Contributed Papers

Multiple Tests Based on Arcsin Transformation in Multi-Sample Models with Bernoulli Responses

Taka-aki Shiraishi

2011 Volume 40 Issue 1 Pages 1-17
Published: 2011
Released on J-STAGE: March 21, 2012

DOI https://doi.org/10.5023/jappstat.40.1

JOURNAL OPEN ACCESS

We consider multiple tests for the differences among proportions in k binomial populations. The simultaneous confidence intervals for all the pairwise differences among the proportions are given in Hochberg and Tamhane (1987). Tukey-Kramer type multiple tests analogous to these simultaneous confidence intervals can be proposed. However, the degree of conservativeness of such multiple tests depends on unknown parameters. Therefore, multiple tests based on the arcsin transformation are proposed. It is shown that the degree of conservativeness of the proposed tests is controlled by the sample sizes. For multiple comparisons with a control, multiple test procedures based on the Bonferroni inequality are stated in Shiraishi (2009) and in Tanaka and Tarumi (1997). Using the arcsin transformation, Dunnett-type multiple tests superior to the former tests are discussed. Furthermore, closed testing procedures more powerful than the Tukey-Welsch tests and the REGW tests are proposed. Lastly, a sequentially rejective procedure is discussed.
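
The variance-stabilizing idea behind the arcsin transformation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's procedure: the function name `arcsine_pairwise_stats` and the example counts are hypothetical, and the statistic is the standard large-sample form in which arcsin(sqrt(p_hat)) has approximate variance 1/(4n) whatever the true proportion is. A Tukey-Kramer type test would compare these statistics with a studentized-range critical value, which is not computed here.

```python
import math
from itertools import combinations


def arcsine_pairwise_stats(successes, sizes):
    """Return {(i, j): z} for all pairs of binomial samples.

    Each sample proportion is mapped to arcsin(sqrt(p_hat)), whose
    large-sample variance is roughly 1 / (4 * n) and does not depend
    on the unknown proportion.
    """
    transformed = [
        math.asin(math.sqrt(x / n)) for x, n in zip(successes, sizes)
    ]
    stats = {}
    for i, j in combinations(range(len(sizes)), 2):
        # Approximate standard error of the difference of transformed values.
        se = math.sqrt(1.0 / (4 * sizes[i]) + 1.0 / (4 * sizes[j]))
        stats[(i, j)] = (transformed[i] - transformed[j]) / se
    return stats


# Hypothetical data: three groups of 100 trials each.
example = arcsine_pairwise_stats([25, 50, 50], [100, 100, 100])
for pair, z in example.items():
    print(pair, round(z, 3))
```

Because the standard error depends only on the sample sizes, the calibration of the pairwise statistics does not hinge on unknown proportions, which is the property the abstract contrasts with the untransformed tests.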

Modified Rule Ensemble Method and its Application for Bioceutical Data

Toshio Shimokawa, Mitsuhiro Tsuji, Masashi Goto

2011 Volume 40 Issue 1 Pages 19-40
Published: 2011
Released on J-STAGE: March 21, 2012

DOI https://doi.org/10.5023/jappstat.40.19

JOURNAL OPEN ACCESS

Ensemble learning methods can improve prediction accuracy by combining multiple base learners, and are studied in the fields of statistical science and data mining. Since ensemble learning methods construct models of a “black box” nature, the models are difficult to interpret. Friedman and Popescu (2008) proposed the rule ensemble learning method, in which nodes of tree models are used as base learners. The rule ensemble method not only presents each base learner as a production rule, but also measures each rule's influence on the response variable through rule importance. In the rule ensemble method, base learners are weighted by shrinkage regression using the least absolute shrinkage and selection operator (lasso). However, when some pairs of base learners are highly correlated, the lasso prunes base learners excessively. In this study, we used the elastic net (Zou and Hastie, 2006) to weight the base learners and so address the problem of excessive pruning. We call our rule ensemble method the EN-RF method. Furthermore, we developed diagnostic graphics for partial variable importance and partial rule importance. The usefulness of the EN-RF method and its diagnostic graphics is illustrated by a practical example with medical data. In the application to medical data, we focused on the characterization of positive (and/or negative) responders. We found that the EN-RF method performs better than the existing regression method.
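
The pruning behaviour described above can be seen in a toy example. The following is a minimal sketch, assuming a plain coordinate-descent solver written for illustration only; it is not the EN-RF method, and the names `soft_threshold` and `elastic_net_cd`, the penalty parameterization, and the data are all hypothetical. Two identical predictors stand in for a pair of highly correlated base learners.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(X, y, lam, alpha, sweeps=1000):
    """Coordinate descent for the penalized least-squares objective
    (1/(2n)) * ||y - X b||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||^2).

    alpha = 1 gives the lasso; 0 < alpha < 1 gives an elastic net.
    No intercept is fitted, so the data are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Average product of column j with the partial residual
            # that leaves coefficient j out of the fit.
            rho = sum(
                X[i][j]
                * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))
    return beta


# Hypothetical centered data with two identical predictors.
x = [-1.5, -0.5, 0.5, 1.5]
X = [[v, v] for v in x]
y = [2.0 * v for v in x]

print("lasso:      ", elastic_net_cd(X, y, lam=0.1, alpha=1.0))
print("elastic net:", elastic_net_cd(X, y, lam=0.1, alpha=0.5))
```

In this sketch the lasso keeps only the first of the two duplicated predictors, while the elastic net spreads the weight evenly across both, which is the grouping behaviour that motivates replacing the lasso when base learners are strongly correlated.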

Notes

Test for a Regression Parameter in a Logistic Regression Model under the Small Sample Size and the High Event Occurrence Probability

Masayuki Ohkura, Toshinari Kamakura

2011 Volume 40 Issue 1 Pages 41-51
Published: 2011
Released on J-STAGE: March 21, 2012

DOI https://doi.org/10.5023/jappstat.40.41

JOURNAL OPEN ACCESS

When a logistic regression model is used with a small sample size and a high or low event occurrence probability, it is important to check for complete or quasi-complete separation. If complete or quasi-complete separation exists, a maximum likelihood estimator cannot be obtained. However, some statistical software packages, such as SAS, S-PLUS, and R, run an iterative method to obtain a maximum likelihood estimate. Commercial software presents the result of the iteration together with a warning message about the existence of complete or quasi-complete separation, or about the failure of the iteration to converge. In contrast, the glm function implemented in R presents the result of the iteration as the maximum likelihood estimate even when the iteration has failed to converge. In this case, the standard error of the regression parameter estimate is very large. We show that it is possible to confirm the existence of complete or quasi-complete separation from the standard error of the regression parameter estimate. Firth (1993) suggested a method to eliminate the bias of the maximum likelihood estimator. As a result, Firth's method can estimate the regression parameter under complete or quasi-complete separation, and the Wald test can be carried out using the standard error of the regression parameter estimate derived from Firth's method. However, the Wald test, whether based on the maximum likelihood method or on Firth's method, is very conservative under a small sample size and a high (or low) event occurrence probability. The aim of this paper is to suggest a test for the regression parameter that uses the bootstrap method instead of the Wald test under a small sample size and a high event occurrence probability, a setting that tends toward complete or quasi-complete separation. Under the null hypothesis, the probability of the type I error of the proposed method is compared with that of the Wald test. We show that the proposed method for the slope parameter improves the type I error and assures the prescribed α significance level under a small sample size and a high event occurrence probability.
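
For a model with a single covariate, separation can also be checked directly from the data. The sketch below is only an illustration under that assumption; it is not the standard-error diagnostic or the bootstrap test discussed in the paper, and the function name `separation_status` and the example data are hypothetical.

```python
def separation_status(x, y):
    """Classify a one-covariate binary data set.

    Returns "complete" if a threshold on x splits the two outcome
    classes with no ties, "quasi-complete" if the classes touch only
    at a boundary value, and "none" otherwise.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    if min(x) == max(x):
        raise ValueError("the covariate must not be constant")
    # Strict ordering of the two classes in either direction.
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    # Ordering that holds only when ties at the boundary are allowed.
    if max(x0) <= min(x1) or max(x1) <= min(x0):
        return "quasi-complete"
    return "none"


# Hypothetical small samples.
print(separation_status([1, 2, 3, 4], [0, 0, 1, 1]))  # classes fully split
print(separation_status([1, 2, 2, 3], [0, 0, 1, 1]))  # classes share x = 2
print(separation_status([1, 3, 2, 4], [0, 0, 1, 1]))  # classes overlap
```

With several covariates the same question concerns a separating hyperplane rather than a threshold, and this simple ordering check no longer applies.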

A Note on Analysis of Ratio of Two Correlated Normal Variables

Koko Asakura, Hiroyuki Uesaka, Tomoyuki Sugimoto, Toshimitsu Hamasaki

2011 Volume 40 Issue 1 Pages 53-71
Published: 2011
Released on J-STAGE: March 21, 2012

DOI https://doi.org/10.5023/jappstat.40.53

JOURNAL OPEN ACCESS

Analysis of the ratio of two variables is common in many areas, with an assumption that the two variables are approximately bivariate-normally distributed. In medical science, many investigators prefer to evaluate the effect of treatments by comparing "pre-treatment" with "post-treatment", i.e., using the ratio of baseline and post-treatment values or the percent change of post-treatment values from baseline. For example, in clinical trials of patients with osteoporosis, percent changes in bone mineral density are frequently used to measure the efficacy of treatments. Some authors recommend the use of the log-transformed ratio to stabilize the variance and induce normality. We numerically evaluate the distributions of the ratio and the log-transformed ratio and then show that the means of the ratio and the log-transformed ratio are biased. We perform a simulation study to determine how much such biases affect the size and power of tests used in one-sample and two-sample problems.
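
The direction and rough size of these biases can be illustrated with a second-order Taylor (delta-method) approximation. This is a heuristic sketch, not the numerical evaluation carried out in the paper: the function names and the example parameter values are hypothetical, X is taken to be the post-treatment value and Y the baseline, and the approximation is only sensible when both means lie many standard deviations away from zero.

```python
import math


def approx_mean_ratio(mu_x, mu_y, sd_x, sd_y, rho):
    """Second-order approximation to E[X / Y] for correlated X and Y."""
    cov = rho * sd_x * sd_y
    return mu_x / mu_y - cov / mu_y ** 2 + mu_x * sd_y ** 2 / mu_y ** 3


def approx_mean_log_ratio(mu_x, mu_y, sd_x, sd_y):
    """Second-order approximation to E[log X - log Y].

    The correlation does not enter at this order because the mixed
    second derivative of log(x) - log(y) is zero.
    """
    return (
        math.log(mu_x / mu_y)
        - sd_x ** 2 / (2 * mu_x ** 2)
        + sd_y ** 2 / (2 * mu_y ** 2)
    )


# Hypothetical values: equal means, so the ratio of means is exactly 1.
for rho in (0.0, 0.5, 1.0):
    print(rho, approx_mean_ratio(10.0, 10.0, 1.0, 1.0, rho))
print(approx_mean_log_ratio(10.0, 10.0, 2.0, 1.0))
```

Under these assumptions the approximate mean of the ratio differs from the ratio of the means unless the variance and covariance terms cancel, and the approximate mean of the log-transformed ratio differs from the log of the ratio of the means whenever the two coefficients of variation differ.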
