2017 Volume E100.D Issue 11 Pages 2702-2710
Soft-thresholding is a sparse modeling method typically applied to wavelet denoising in statistical signal processing. It is also important in machine learning because it is an essential ingredient of the well-known LASSO (Least Absolute Shrinkage and Selection Operator). Soft-thresholding, and hence LASSO, is known to suffer from a dilemma between sparsity and generalization, which is caused by excessive shrinkage at a sparse representation. Several methods for mitigating this problem have been proposed in signal processing and machine learning. In this paper, we extend and analyze a method of scaling soft-thresholding estimators. In the setting of a non-parametric orthogonal regression problem, which includes the discrete wavelet transform, we introduce a component-wise and data-dependent scaling that turns out to be identical to the non-negative garrote. We consider the case where the threshold parameter of soft-thresholding is chosen from the absolute values of the least squares estimates, so that the model selection problem reduces to determining the number of non-zero coefficient estimates. For this case, we first derive the risk and construct a SURE (Stein's unbiased risk estimator) that can be used to determine the number of non-zero coefficient estimates. We also analyze some properties of the risk curve and find that our scaling method with the derived SURE can yield a model with lower risk and higher sparsity than a naive soft-thresholding method with SURE. This theoretical prediction was verified by a simple numerical experiment on wavelet denoising.
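To make the relation between the estimators concrete, the following is a minimal sketch in Python with NumPy, not the implementation used in the paper. It assumes the standard orthogonal-design forms: soft-thresholding sign(c)(|c| - lam)_+ and the non-negative garrote c(1 - lam^2/c^2)_+, where c denotes the least squares estimates. The scaling factor 1 + lam/|c_i| is an assumption chosen because it reproduces the garrote algebraically; the abstract does not state the exact form of the scaling. The function names, and the rule of taking lam as the (k+1)-th largest |c_i| so that k estimates remain non-zero, are likewise illustrative.

```python
import numpy as np


def soft_threshold(c, lam):
    """Soft-thresholding: shrink each coefficient toward zero by lam."""
    c = np.asarray(c, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)


def nonneg_garrote(c, lam):
    """Non-negative garrote for an orthogonal design: c * (1 - lam^2 / c^2)_+."""
    c = np.asarray(c, dtype=float)
    out = np.zeros_like(c)
    keep = np.abs(c) > lam
    out[keep] = c[keep] - lam**2 / c[keep]
    return out


def scaled_soft_threshold(c, lam):
    """Soft-thresholding times an assumed component-wise scale 1 + lam / |c_i|."""
    c = np.asarray(c, dtype=float)
    scale = np.ones_like(c)
    keep = np.abs(c) > lam  # zeroed components need no scaling
    scale[keep] = 1.0 + lam / np.abs(c[keep])
    return scale * soft_threshold(c, lam)


def threshold_for_k_nonzeros(c, k):
    """Take lam from the sorted |c_i| so that k estimates stay non-zero (no ties assumed)."""
    a = np.sort(np.abs(np.asarray(c, dtype=float)))[::-1]
    return a[k] if k < a.size else 0.0


# Hypothetical least squares estimates, for illustration only.
c = np.array([3.0, -2.0, 0.5, -0.1])
lam = threshold_for_k_nonzeros(c, k=2)

st = soft_threshold(c, lam)
sst = scaled_soft_threshold(c, lam)
nng = nonneg_garrote(c, lam)
print(st, sst, nng)
```

Under these assumptions the scaled estimate coincides with the garrote, since (|c| - lam)(1 + lam/|c|) = |c| - lam^2/|c|. It keeps the same sparsity pattern as soft-thresholding while shrinking the surviving coefficients less.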