Abstract
A normal expansion method combined with the MDL (Minimum Description Length) principle is proposed to construct an approximation of a general distribution from its discrete data. Each data point is transformed into a basic normal distribution, and the composition of these basic distributions is used to approximate the original distribution. Theoretically, the composition is the same as the original distribution if the number of discrete data is infinite. For a finite quantity of data, a suitable approximation can be obtained if the variance of each basic normal distribution is selected so as to reach the MDL value. In this paper, original distributions with a single peak, such as the normal distribution and the beta distribution, are investigated first, and the generated results show the feasibility and effectiveness of the method. Then original distributions with multiple peaks are approximated, and the difference between using an identical variance and using a different variance for each peak is discussed. Finally, two applications, the smoothing of histograms and the evaluation of data quantity, are introduced to demonstrate the use of the proposed method.
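As an illustration of the composition step only, the following minimal Python sketch places one normal distribution on each data point and averages them. The function names `normal_pdf` and `normal_expansion`, the equal weighting of the basic distributions, and the sample values are assumptions made for this sketch and are not taken from the paper. The standard deviations are passed in by hand; the MDL-based selection of the variances is not reproduced here, since the criterion is not stated in the abstract.

```python
import math


def normal_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_expansion(data, sigmas):
    """Return a density function built from one basic normal per data point.

    Assumption: every basic normal gets the same weight 1/n.
    `sigmas` holds one standard deviation per data point, so both an
    identical variance and a different variance per point can be expressed.
    """
    if len(data) != len(sigmas):
        raise ValueError("data and sigmas must have the same length")
    n = len(data)

    def density(x):
        return sum(normal_pdf(x, xi, si) for xi, si in zip(data, sigmas)) / n

    return density


# Hypothetical sample: three data points with an identical standard deviation.
data = [-1.0, 0.0, 1.0]
density = normal_expansion(data, [0.5] * len(data))

# Trapezoidal check that the composition is itself a density (area close to 1).
step = 0.01
grid = [-6.0 + i * step for i in range(1201)]
area = sum(0.5 * (density(a) + density(b)) * step for a, b in zip(grid, grid[1:]))
```

Because each basic normal integrates to one and the weights sum to one, the composition integrates to one for any choice of positive standard deviations; what the choice changes is how smooth or how spiky the approximation is.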