ABSTRACT Evaluating the relationship between a response variable and explanatory variables is an important part of establishing better statistical models. Concordance provides a measure of this relationship. In this study, we estimate the concordance for time-to-event data, which often occur in the medical sciences. In general, such data sets contain censored cases. Moreover, the distribution of the censoring time usually varies among studies, even when the target population is the same. Hence, it is desirable to reduce the effect of the censoring distribution when estimating the concordance. Here, we propose estimators of the concordance based on cross-validation. These estimators can reduce the optimistic bias originating from plugging in the estimators of the censoring distribution and the parameters of a model. In addition, we present numerical experiments to illustrate the properties of the proposed estimators in comparison to existing estimators.
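For orientation, the following is a minimal sketch of a classical concordance estimator for right-censored data (Harrell's C), not the cross-validated estimators proposed in the abstract. The function name `harrell_c` and the convention that a higher score means an earlier expected event are assumptions made for illustration. Which pairs count as comparable depends on the censoring pattern, which illustrates why the censoring distribution can affect such an estimate.

```python
def harrell_c(times, events, scores):
    """Harrell's C for right-censored data (illustrative sketch).

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if the case was censored
    scores : risk scores; a higher score is assumed to mean an earlier event
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored case cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Because every comparable pair must start from an observed event, changing which cases are censored changes the set of pairs that enter the estimate.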
ABSTRACT Novel simulation studies are performed to investigate the performance of likelihood-based and entropy-based information criteria for estimating the number of classes in latent growth curve mixture models, considering the influences of true model complexity and model misspecification. The simulation results can be summarized as follows. (1) Increased model complexity worsens the performance of all criteria, and this is especially salient for the Bayesian information criterion (BIC) and the consistent Akaike information criterion (CAIC). (2) The classification likelihood information criterion (CLC) and the integrated completed likelihood criterion with BIC approximation (ICL.BIC) frequently underestimate the number of classes. (3) Entropy-based criteria correctly estimate the number of classes more frequently. (4) When a normal mixture is incorrectly fit to non-normal data including outliers, the performance of many criteria worsens seriously, but BIC, CAIC, and ICL.BIC are relatively robust. Additionally, overextracted classes with trivially small mixture proportions can be detected when the sample size is large. (5) When there is an upper bound of measurement, the performance of almost all criteria worsens, but entropy-based criteria are robust. (6) Although no single criterion is always best, ICL.BIC shows better performance on average.
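As a point of reference, the criteria named above can be sketched using their commonly cited textbook definitions. This assumes the definitions BIC = -2 log L + k ln n, CAIC = -2 log L + k (ln n + 1), CLC = -2 log L + 2 EN, and ICL.BIC = BIC + 2 EN, where EN is the entropy of the posterior class probabilities. The study itself may use variants, and the function names are hypothetical.

```python
import math

def entropy(posteriors):
    """EN = -sum_i sum_k tau_ik * ln(tau_ik), with 0 * ln(0) taken as 0."""
    return -sum(t * math.log(t) for row in posteriors for t in row if t > 0.0)

def information_criteria(log_lik, n_params, n_obs, posteriors):
    """Return common class-enumeration criteria (smaller is better).

    log_lik    : maximized log-likelihood of the fitted mixture model
    n_params   : number of free parameters k
    n_obs      : sample size n
    posteriors : n rows of posterior class probabilities tau_ik
    """
    deviance = -2.0 * log_lik
    en = entropy(posteriors)
    bic = deviance + n_params * math.log(n_obs)
    return {
        "BIC": bic,
        "CAIC": deviance + n_params * (math.log(n_obs) + 1.0),
        "CLC": deviance + 2.0 * en,
        "ICL.BIC": bic + 2.0 * en,
    }
```

In a typical workflow, one would fit models with 1, 2, 3, ... classes and, for each criterion, select the number of classes that minimizes it.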
ABSTRACT Rukhin et al. (2010) proposed the non-overlapping template matching test as one of the methods for statistical testing of randomness in cryptographic applications. This test is very interesting, but neither its statistical properties nor any method for setting the template has been shown. Our new contribution in this paper is to propose a modified version of this test, including the setting of the template, and to show through simulation studies how effectively the modified test works.
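The following is a minimal sketch of the original (unmodified) non-overlapping template matching test as it is commonly described in NIST SP 800-22, not the modified version proposed in the abstract. The function names are hypothetical. To stay within the standard library, the sketch assumes an even number of blocks, so that the chi-square tail probability has a closed form.

```python
import math

def count_nonoverlapping(block, template):
    """Count non-overlapping occurrences of template in block (both bit strings)."""
    m = len(template)
    count, i = 0, 0
    while i <= len(block) - m:
        if block[i:i + m] == template:
            count += 1
            i += m  # after a match, skip past the matched window
        else:
            i += 1
    return count

def template_matching_test(bits, template, n_blocks):
    """Return (per-block counts, chi-square statistic, p-value)."""
    if n_blocks % 2 != 0:
        raise ValueError("this sketch assumes an even number of blocks")
    m = len(template)
    block_len = len(bits) // n_blocks
    counts = [
        count_nonoverlapping(bits[j * block_len:(j + 1) * block_len], template)
        for j in range(n_blocks)
    ]
    # Mean and variance of the per-block count under the randomness hypothesis.
    mu = (block_len - m + 1) / 2 ** m
    var = block_len * (1 / 2 ** m - (2 * m - 1) / 2 ** (2 * m))
    chi2 = sum((w - mu) ** 2 for w in counts) / var
    # Upper tail of chi-square with n_blocks degrees of freedom:
    # Q(a, x) = exp(-x) * sum_{k<a} x^k / k! for integer a = n_blocks / 2.
    a, x = n_blocks // 2, chi2 / 2
    p_value = math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(a))
    return counts, chi2, p_value
```

For example, `template_matching_test("10100100101110010110", "001", 2)` splits the 20-bit sequence into two blocks of 10 bits and counts the template in each block. A small p-value would suggest non-randomness.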
ABSTRACT Analysis based on interval-valued symbolic variables, which are given as p-dimensional hyperrectangles in ℜ^p, is considered appropriate in some scenarios. However, the methods for analyzing these variables are not as well studied as those for classical variables, which are given as single points in ℜ^p. The regression tree constructed using the CART algorithm is one such method, and we consider it in this paper. Several models have been considered for constructing a regression tree based on interval-valued symbolic variables. Our proposed model differs from the other models because, in this model, a concept can be included in several terminal nodes of a tree. Constructing a regression tree using the proposed model requires addressing several problems, such as how to represent the predictive model in each node and how to search for an optimal splitting point in interval values. We address these problems and present an application of this model to data on HIV-1-infected patients.
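One way to picture a concept belonging to several terminal nodes is to let an interval that straddles a split point contribute to both child nodes. The sketch below is purely illustrative and is not the model proposed in the abstract: it assumes values are uniformly distributed within each interval, and the function name `split_weights` is hypothetical.

```python
def split_weights(low, high, cut):
    """Share of the interval [low, high] falling left and right of a split point.

    Assumes a uniform distribution inside the interval (an illustrative
    assumption, not taken from the source).
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if low == high:
        # A degenerate interval behaves like a classical single point.
        return (1.0, 0.0) if low <= cut else (0.0, 1.0)
    left = min(max((cut - low) / (high - low), 0.0), 1.0)
    return left, 1.0 - left
```

Under this assumption, an interval [0, 10] split at 4 would be assigned to the left child with weight 0.4 and to the right child with weight 0.6, while an interval lying entirely on one side goes to a single child.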
ABSTRACT In each SIMD (Single Instruction, Multiple Data) group of a GPU (Graphics Processing Unit), called a "warp", a fixed number of threads all execute the same instruction concurrently at each unit period of time. We consider a class of probabilistic algorithms designed for use on GPUs, including a wide variety of Monte Carlo methods, in which each thread contains a loop iterated a stochastically variable number of times and the life cycle of a warp ends when the slowest thread completes its requested task. A run-time model is proposed to explain the distributions of execution time observed in SIMD parallel computations using the algorithms of this class. Asymptotic properties of those distributions are also presented.
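The effect described above, where a warp finishes only when its slowest thread finishes, can be illustrated with a small CPU-side simulation. This is a sketch under stated assumptions, not the run-time model proposed in the abstract: the geometric distribution of loop counts, the warp size of 32, the stopping probability, and the function names are all assumptions chosen for illustration.

```python
import random

def thread_iterations(rng, p_stop):
    """Number of loop iterations for one thread (geometric, assumed)."""
    n = 1
    while rng.random() >= p_stop:  # continue looping with probability 1 - p_stop
        n += 1
    return n

def simulate_warps(n_warps, warp_size=32, p_stop=0.1, seed=0):
    """Return (warp run times, all individual thread iteration counts)."""
    rng = random.Random(seed)
    warp_times, thread_times = [], []
    for _ in range(n_warps):
        iters = [thread_iterations(rng, p_stop) for _ in range(warp_size)]
        thread_times.extend(iters)
        # The warp is occupied until its slowest thread completes.
        warp_times.append(max(iters))
    return warp_times, thread_times
```

Comparing the average of `warp_times` with the average of `thread_times` shows how much of a warp's lifetime is spent waiting for the slowest thread, which is the kind of gap a run-time model for this class of algorithms needs to capture.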