Sparse estimation has become the standard approach to regression analysis when there are many candidate explanatory variables. For selecting the regularization parameter in sparse estimation, however, even when an AIC for the sparse estimator is available in a simple form and the goal is good prediction, that AIC is not always used; in other words, no standard method appears to be established. In this paper, we conduct numerical experiments to evaluate the performance of models estimated by combining LASSO with AIC in such a setting, specifically when LASSO is used in normal linear regression analysis. We first compare this combination, in terms of prediction squared error, with ridge regularization combined with AIC, best subset regression using maximum likelihood and the ordinary AIC, and LASSO combined with cross-validation. For the cross-validation method, we also examine how much the estimation results vary with the data partitioning, by numerically evaluating the variability of the prediction squared error and of the degrees of freedom of the selected model. The AIC for LASSO is derived from SURE theory; since this derivation does not seem to be well known, we present it in a slightly generalized setting at the end.
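As a rough illustration of the LASSO-plus-AIC combination described above, the following NumPy-only sketch fits LASSO by coordinate descent over a grid of regularization parameters and selects the one minimizing an AIC in which the degrees of freedom equal the number of nonzero coefficients (the standard LASSO df formula from SURE theory). The simulated data, the grid, and the coordinate-descent solver are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    # Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]      # partial residual excluding j
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b

def aic_lasso(X, y, b):
    # AIC with LASSO degrees of freedom = number of nonzero coefficients
    n = len(y)
    rss = ((y - X @ b) ** 2).sum()
    df = np.count_nonzero(b)
    return n * np.log(rss / n) + 2 * df

# Illustrative sparse normal linear regression data (not the paper's design)
rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                     # sparse true coefficients
y = X @ beta + rng.standard_normal(n)

lams = np.logspace(-3, 0, 20)                   # regularization grid
fits = [lasso_cd(X, y, lam) for lam in lams]
aics = [aic_lasso(X, y, b) for b in fits]
best = fits[int(np.argmin(aics))]
print("selected support:", np.flatnonzero(best))
```

The AIC here is the Gaussian-likelihood form with the noise variance profiled out; the paper's comparisons (ridge + AIC, best subset + ordinary AIC, LASSO + cross-validation) would plug different estimators and selection criteria into the same evaluation loop.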