Abstract
This paper clarifies the mathematical foundations of statistical inference with hierarchical parametric models. Using the Sato–Bernstein b-function, we show that the asymptotic form of the average Kullback distance between the true distribution and the Bayesian estimated one is λ1/n − (m1 − 1)/(n log n), where n is the number of empirical samples. We also show that the constants λ1 and m1, which are invariant under birational transformations, can be calculated by resolution of singularities in algebraic geometry, and that λ1 is smaller than the number of parameters. Even when the true distribution is not contained in the parametric model, hierarchical models with Bayesian estimation are better learning machines than regular statistical models.
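The asymptotic form stated above can be illustrated numerically. The following sketch (not part of the paper) evaluates λ1/n − (m1 − 1)/(n log n) for illustrative, hypothetical values of λ1 and m1; the comparison with λ1 = d/2, m1 = 1 is meant only as an example of how a smaller λ1 yields a smaller leading term.

```python
import math

def asymptotic_kl(n, lambda1, m1):
    """Leading-order asymptotic average Kullback distance:
    lambda1 / n - (m1 - 1) / (n * log n)."""
    return lambda1 / n - (m1 - 1) / (n * math.log(n))

# Hypothetical constants chosen for illustration, not derived here.
d = 10                                   # number of parameters
n = 1000                                 # number of empirical samples
regular = asymptotic_kl(n, d / 2, 1)     # example with lambda1 = d/2, m1 = 1
singular = asymptotic_kl(n, 2.0, 2)      # example with smaller lambda1

print(regular, singular)                 # the second value is smaller
```

With these illustrative constants, the term with the smaller λ1 dominates the comparison, consistent with the claim that a smaller λ1 gives faster decay of the average Kullback distance.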