Abstract
Deduplication backup technology removes redundant data segments across a system to reduce the capacity required in the target backup storage. It also improves performance and lowers resource utilization, energy consumption, and total cost of ownership (TCO). This paper describes an optimization method for deduplication backup in IT systems in which multiple deduplication processes are installed and active simultaneously. The method provides an algorithm for assigning backup target files to the installed deduplication processes so as to maximize the aggregate deduplication ratio while satisfying predefined system requirements, such as backup-window and resource-utilization limits. In a practical system, the time each process takes to perform deduplication is not constant; it varies with operational resource contention, queue waiting time, the characteristics of the data, and so on.
The proposed method observes that these time parameters can be modeled as following a normal distribution, formulates the problem as a discrete assignment program, and then combines integer linear programming, which maximizes the deduplication ratio, with a binary adjustment of parameters, which keeps the result within the variance. By applying the method, a system can achieve the maximal deduplication ratio while meeting the time requirements within a predefined tolerance. The effectiveness of the method is demonstrated by simulation.
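As an illustration only, the assignment problem summarized above can be sketched for a toy instance. The sketch is not the paper's algorithm: it replaces the integer linear program with exhaustive enumeration, and it replaces the binary adjustment of parameters with a fixed safety factor derived from the normal distribution. The function name `best_assignment`, its inputs (`saved`, `mean`, `var`, `window`, `tolerance`), and the assumption that processing times are independent are all hypothetical choices made for this example.

```python
from itertools import product
from math import sqrt
from statistics import NormalDist


def best_assignment(saved, mean, var, window, tolerance=0.05):
    """Exhaustively choose one deduplication process per file (toy sizes only).

    saved[f][p]: bytes removed if file f is handled by process p (hypothetical)
    mean[f][p], var[f][p]: mean and variance of the processing time,
        assumed to be normally distributed and independent across files
    window: backup-window limit applied to each process
    tolerance: accepted probability that a process overruns the window
    """
    n_files, n_procs = len(saved), len(saved[0])
    # One-sided safety factor: P(time <= mu + z * sigma) = 1 - tolerance.
    z = NormalDist().inv_cdf(1.0 - tolerance)
    best_plan, best_total = None, None
    for plan in product(range(n_procs), repeat=n_files):
        feasible = True
        for p in range(n_procs):
            files = [f for f in range(n_files) if plan[f] == p]
            mu = sum(mean[f][p] for f in files)
            sigma = sqrt(sum(var[f][p] for f in files))
            if mu + z * sigma > window:
                feasible = False
                break
        if not feasible:
            continue
        total = sum(saved[f][plan[f]] for f in range(n_files))
        if best_total is None or total > best_total:
            best_plan, best_total = plan, total
    return best_plan, best_total


# Made-up numbers: 3 files, 2 processes, window of 10 time units.
saved = [[50, 30], [40, 25], [30, 25]]
mean = [[4, 3], [4, 3], [4, 3]]
var = [[1, 1], [1, 1], [1, 1]]
plan, total = best_assignment(saved, mean, var, window=10)
```

Enumeration grows exponentially with the number of files, so it serves only to make the objective and the probabilistic time constraint concrete; a realistic instance would need an integer programming solver, as the abstract indicates.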