ISIJ International
Online ISSN : 1347-5460
Print ISSN : 0915-1559
ISSN-L : 0915-1559
Regular Article
On-line Energy Allocation Based on Approximate Dynamic Programming for Iron and Steel Industry
Yanyan Zhang, Qingxin Guo, Lixin Tang

2016 Volume 56 Issue 12 Pages 2214-2223

Abstract

Energy allocation in the iron and steel industry is the assignment of available energy to various production users. With rising energy prices, an ideal allocation plan should avoid both waste and shortage. This is challenging because energy demand is dynamic, driven by changes in orders, the production environment, technological level, etc. Based on the typical energy consumption process of steel enterprises, this paper aims to realize on-line energy allocation under a dynamic production plan and environment. Without a definite analytical model, making the energy allocation plan track the dynamic change of the production environment in real time is a tough task. This paper proposes to handle the dynamic energy allocation problem by interactive learning with the time-varying environment using an Approximate Dynamic Programming method. The problem is formulated as a dynamic model with variable right-hand sides, namely the energy demand updated by on-line learning. A reinforcement learning method is designed to learn the energy consumption principle from historical data and to predict the energy consumption level corresponding to the current production environment and the production plan over the future horizon. Using the prediction results, an on-line energy allocation plan is made, and its performance is demonstrated by comparison with a static allocation method.

1. Introduction

Energy is the foundation of modern industry and faces tremendous demand growth driven by sustained economic growth, which results in a rapid increase in the demand for primary and secondary energy. As one of the typical energy-intensive industries, the iron & steel industry accounts for about 15% of the total energy consumption of China. Traditional research on iron and steel planning and optimization usually focuses on production itself.1) However, with the increase of energy prices, the cost of energy consumption accounts for 20–35% of the total iron & steel production cost. Moreover, the comprehensive energy consumption of China is much higher than the world level, with an energy utilization rate of only 30%. Measured by the common indicator of comprehensive energy consumption per ton of steel, China is well above developed countries such as the United States, Japan and South Korea. Therefore, there is considerable optimization potential.

Iron & steel production is a complicated system with multiple processes, multiple pieces of equipment and multiple energy media. Production and energy consumption interrelate and interact with each other: more than 20 kinds of energy media, including electricity, gas, steam, water and oxygen, are consumed, converted and regenerated simultaneously. During this course, the energy demand of each process, the energy supply capacity, the holding capacity, safety, emission restrictions and the dynamism of all of the above need to be taken into account. Ill-considered allocation and use of energy inevitably leads to higher product cost, more pollution, more emissions, and so on. Therefore, it has been a major task for the energy management department to ensure continuous, safe and economical energy supply as well as efficient energy utilization. Developing energy-saving strategies has become an increasingly prominent task, which can be accomplished through technological progress and equipment renovation, or by improving the management level. The former two strategies usually involve huge cost due to equipment and production technology replacement, while the last aims at reducing the energy consumption level by exploring advanced management tools, which, from the viewpoint of cost, is a feasible way to improve the energy utilization rate in the iron & steel industry.

Currently, the common challenges that most iron & steel enterprises confront are the lack of a deterministic model or rule for energy consumption and the absence of energy allocation from the viewpoint of the whole factory. The current situation of traditional energy management can be summarized as follows. First, energy is allocated based on the production plan at the beginning of the production horizon and remains unchanged until the end of the horizon. Second, allocation usually focuses on a single energy type, for example, the gas system or the electricity system. In real-world industry, any change, such as a machine breakdown, the arrival of a rush order or a temporary change of orders, instability in workers' operating level, or a change of equipment operating status, will make the production environment deviate from its original status, which in turn affects the energy demand. Moreover, coupling among different types of energy exists throughout production: for example, while primary energy (electricity, coal, natural gas, etc.) is consumed in the ironmaking, steelmaking and rolling processes, secondary energy (coke oven gas, blast furnace gas, Linz-Donawitz gas, steam and so on) is generated, and surplus gas and steam can be used to generate electricity.

The key task of energy allocation is to comprehensively consider consumption and regeneration over the whole production horizon. The goal is to reduce the total production cost while guaranteeing smooth production. Traditional research on energy allocation, whether static or dynamic, usually assumes known and fixed parameters over the horizon; however, the dynamic energy demand in almost all practical situations requires the allocation plan to closely track the change of the production plan so as to increase the utilization rate. Ignoring the above energy features during allocation inevitably leads to energy waste and emissions. Therefore, traditional fixed-demand-based methods no longer meet practical requirements, and research on new on-line energy allocation is needed.

The idea of using real-time information to make decisions can be found in research on inventory management, that is, designing efficient methods to obtain accurate demand estimates over the future lead time when signing order contracts.2,3,4,5) In the production area, some rescheduling methods have been reported to deal with unexpected real-time events.6) However, research with real-time information updates has not been found, and most reports on energy allocation in production focus on the deterministic case. Boukas and Haurie7) studied an electric arc furnace scheduling problem with energy constraints. The objective is to determine the start time for processes in each production cycle subject to energy control constraints, priority constraints and availability constraints of shared equipment; a hierarchical method is used for solving. Javier8) studied a job-shop integrated manufacturing and control system under real-time electricity pricing in his dissertation. The effect of energy and waste on production scheduling is taken into account to minimize the total production cost, including the energy cost. Ruiz et al.9) studied real-time on-line optimization of the utility system of a production site; the objective is to minimize the overall system cost under constraints on equipment, fuel and electricity pricing, emission limits, quotas and rights. Applications using real-world data from refineries and chemical plants demonstrated the optimization results. Cai et al.10) developed a large-scale dynamic optimization model for long-term energy systems planning. The model describes the energy management system as a network of energy flows, transferring energy resources in and out to end users over a number of periods. The system helps to tackle the dynamic and interactive characteristics of the energy management system when examining the impacts of energy and environmental policies, regional development strategies and emission reduction measures.

The on-line energy allocation problem in this paper originates from the actual needs of typical energy-intensive processes in iron & steel plants. This research tries to allocate energy among all users in real time, considering the consumption of primary energy, the regeneration and conversion of secondary energy, and the dynamic features of the production environment. The dynamism caused by the above-mentioned changes, together with the coupling and uncertainty among these factors, affects the energy demand and consumption level; only an allocation plan based on the latest information can meet the requirements of practical production. This is the motivation and focus of this paper.

Some operations in iron and steel production last a few hours, during which the production status changes frequently. Thus on-line energy allocation is a multi-stage sequential decision problem. Dynamic Programming (DP) is usually used to deal with multi-stage optimization problems and has wide applications in engineering, operations research and economics.11,12,13,14) However, DP records all suboptimal decisions for every possible state from a certain time point to the end of the horizon, so the burden of computation and storage increases exponentially with the dimension of the state space.15) This "curse of dimensionality" limits DP in large-scale applications.16) In recent years, to overcome the curse of dimensionality, Approximate Dynamic Programming (ADP) has gradually attracted researchers' attention; its basic idea is to replace the exact computation over all states at each stage with an estimated cost function.17) Generally, ADP uses cost-to-go function approximation18) to construct parameters from sample data in a discrete state space and gradually improves its performance through on-line interaction with the environment. In ADP, a critic module approximates the cost-to-go function and an action module generates actions; the critic evaluates the performance of an action, based on which the action module adjusts itself. Therefore, ADP is not affected by inaccurate parameters and is appropriate for solving problems without known models. Schmid19) designed an ADP-based optimization method for real-time ambulance scheduling, featured as follows: an appropriate vehicle needs to be dispatched and sent to the required site immediately once a request emerges, and after completing a request the vehicle should be relocated to its next waiting location. Tests on real-world data of a city show that, compared with traditional assignment rules, ADP improved the average response time by 2.89%. Ganesan et al.20) addressed the taxi-out time prediction problem at airports and developed an ADP-based solving method. Experimental results for 5 US airports show a 14–50% improvement compared with regression models. Simao et al.21) took the largest truckload motor carrier in the United States as the background and studied a large-scale fleet scheduling problem, considering multiple dynamic characteristics (driver type, location, domicile, work time constraints, etc.). The authors combined ADP with mathematical programming, which provided satisfactory solutions with good performance regarding the costs of hours-of-service, cross-border driver management, driver hiring, etc.

To sum up, ADP exhibits advantages in solving real-time or dynamic problems, which suggests it is suitable for the on-line energy allocation problem studied in this paper. Most research on energy allocation is based on a given demand (usually obtained by estimation or simple computation, for example, a weighted average) and lacks real-time interaction with the environment. In the real world, as a production plan is executed, the production situation rarely remains the same as at the beginning; naturally, the energy demand changes accordingly and the initial allocation plan is no longer accurate. This paper proposes a demand-update-based on-line energy allocation method, a rolling prediction method that takes full advantage of ADP's superiority in dealing with time-varying problems. An on-line energy allocation model with variable right-hand sides is established; ADP is used to estimate the energy demand based on the latest production information, and the updated demand prediction results are then used to generate the energy allocation among all users.

2. Problem Description and Modeling of On-line Energy Allocation

In the iron & steel industry, there are usually six energy-intensive processes - coking, sintering, ironmaking, steelmaking, hot rolling and cold rolling - as shown in Fig. 1. Different processes need different types of energy to ensure production. Some types of energy, like electricity and gas, are required by almost all processes; others, like oxygen, are required mainly by ironmaking and steelmaking. The energy allocation problem can be described as follows: dynamically allocate each kind of energy medium among users in a given horizon so as to minimize the total energy consumption cost, considering primary energy, secondary energy and energy conversion simultaneously.

Fig. 1.

Illustration of energy consumption and regeneration in all processes of iron & steel enterprise.

The task of energy prediction is to provide the energy management department with the accurate amount of energy needed in certain future time periods. To make the energy allocation plan closely follow any change of the production environment, this paper proposes an on-line energy allocation method as follows: the planning horizon is divided into a finite number of time periods; the real-world energy consumption data, the production plan and the environment information are updated period by period, based on which the energy demand prediction for each period is carried out and embedded into the on-line energy allocation plan. In general, the on-line energy allocation problem combines learning with dynamic programming-based optimization. The overall structure of the proposed framework is based on dynamic programming; at each stage a linear programming model is formulated to express the allocation subproblem, and the real-time energy demand is obtained from the embedded ADP prediction module. In essence, this framework is an environment-interaction-based allocation method. The basic idea of energy allocation at each stage is shown by Fig. 2.

Fig. 2.

The basic idea of energy allocation at each stage of on-line energy allocation.
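The rolling structure described above can be sketched in a few lines (an illustrative toy, not the paper's implementation: the moving-average predictor and the fixed 2% drift of the "observed" consumption stand in for the ADP predictor and the real plant feedback):

```python
from typing import List, Tuple

def rolling_allocation(T: int, history: List[Tuple[int, float]]) -> List[float]:
    """Divide the horizon into T periods; each period, predict demand from the
    latest history, allocate it, then feed the observation back (rolling update)."""
    plan = []
    for t in range(T):
        recent = [c for (_, c) in history[-3:]]   # stand-in predictor:
        demand = sum(recent) / len(recent)        # moving average of last 3 periods
        plan.append(demand)                       # allocate the predicted demand
        observed = demand * 1.02                  # stand-in for the measured value
        history.append((t, observed))             # history grows period by period
    return plan

plan = rolling_allocation(5, [(k - 3, 10.0) for k in range(3)])
```

The essential point is the feedback edge: each period's observation enters the history before the next prediction, which is what distinguishes the on-line scheme from a static one.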

As the core part of the proposed framework, energy demand prediction estimates the quantity of energy needed to complete future production based on historical energy consumption data and real-time information. The accuracy of the prediction is affected by many factors. First, production data close to the prediction period expresses the current production status more accurately than faraway data; therefore, using "near" data may bring relatively accurate prediction results. Second, the prediction accuracy is affected by the design of the input and output variables and the modeling method. To determine the input and output variables, the factors that affect the energy consumption level should be identified through analysis or experiments. Based on analysis and practical experience, the quantity of energy used is closely related to the production conditions, among which the production yield and the air temperature have significant impact. The role of energy is to serve and guarantee production, so it is easy to understand that the consumption level of all kinds of energy is directly proportional to the yield: more production means more consumption of gas, electricity, water and so on. In addition, iron & steel production is temperature-related, and sometimes the heating process starts from the air temperature; therefore, differences in air temperature have significant impact on the demand for gas or electricity. Consequently, in the prediction model, the planned yield and the air temperature are designed as the input variables, and the items to be predicted (including demand and regeneration) as the output variables.

The on-line energy demand prediction in this paper is carried out by ADP, which interacts with the environment in a dynamic manner to realize estimation. The implementation of ADP consists of two stages: learning and testing. At the learning stage, the ADP learning procedures are designed to learn the consumption principle from the sample data, which are updated in a rolling way over time so as to approach the prediction environment as closely as possible. At the testing stage, the energy demand is estimated based on the learnt results, the production plan and other environmental information.

2.1. Mathematical Model for Energy Allocation

The parameters of on-line energy allocation problem are listed as follows:

t: Time period index, t = 1,2,…, T, T is the planning horizon

i: Process index, i = 1,2,…, I, I is the number of processes

j: Energy media index, j = 1,2,…, J, J is the number of energy media

xijt: Allocated quantity of energy j in time period t at process i

cjt: Unit cost of energy j in time period t

Sjt: Supply quantity of energy j in time period t

αij: Secondary energy productive rate of generating energy j at process i

$p_{ijt}^{s}$: Unit penalty for shortage of energy j in time period t at process i

$p_{ijt}^{e}$: Unit penalty for excess of energy j in time period t at process i

dijt: Demand of energy j in time period t at process i

The purpose of on-line energy allocation is to achieve optimal allocation and improve energy efficiency. The objective is set as: minimization of energy consumption cost, energy shortage penalty and energy waste penalty.   

$$\min z=\sum_{t=1}^{T}\sum_{i=1}^{I}\sum_{j=1}^{J}c_{jt}x_{ijt}+\sum_{t=1}^{T}\sum_{i=1}^{I}\sum_{j=1}^{J}p_{ijt}^{s}\max\{0,\,d_{ijt}-x_{ijt}\}+\sum_{t=1}^{T}\sum_{i=1}^{I}\sum_{j=1}^{J}p_{ijt}^{e}\max\{0,\,x_{ijt}-d_{ijt}\}\qquad(1)$$

The constraints are the energy supply constraints (2) and the nonnegativity constraints (3). Constraints (2) require that in any time period, for any kind of energy medium, the consumption cannot exceed the available quantity, which comprises the supply capacity of the current period and the secondary energy generated in the immediately preceding period.

$$\sum_{i=1}^{I}x_{ijt}\le S_{jt}+\sum_{i=1}^{I}\alpha_{ij}x_{ij,t-1},\quad\forall j,\qquad x_{ijt}\big|_{t=0}=0\qquad(2)$$
  
$$x_{ijt}\ge 0,\quad\forall i,j,t\qquad(3)$$

The energy allocation problem expressed by (1), (2) and (3) involves time-based multi-stage decisions and can be solved by Dynamic Programming using the following state transition equations:

$$f_{t}(S_{t})=\min_{x_{ijt}\in X_{ijt}}\left[c_{t}(S_{t},x_{ijt})+f_{t-1}(S_{t-1})\right],\qquad f_{0}(S_{0})=0\qquad(4)$$
  
$$c_{t}(S_{t},x_{ijt})=\sum_{i=1}^{I}\sum_{j=1}^{J}c_{jt}x_{ijt}+\sum_{i=1}^{I}\sum_{j=1}^{J}p_{ijt}^{s}\max\{0,\,d_{ijt}-x_{ijt}\}+\sum_{i=1}^{I}\sum_{j=1}^{J}p_{ijt}^{e}\max\{0,\,x_{ijt}-d_{ijt}\}\qquad(5)$$

where $c_{t}(S_{t},x_{ijt})$ is the stage value comprising the energy consumption cost, the energy shortage penalty and the energy waste penalty. Traditionally, the above state transition equations of DP are based on static parameters, meaning that the parameters in (4) and (5) remain unchanged from the beginning to the end of the time horizon. To deal with the real-world dynamic energy allocation problem with time-varying parameters, an ADP-based solving method is proposed and elaborated in subsection 2.2 and section 3.
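For concreteness, the stage value of Eq. (5) can be evaluated directly from the allocation, demand and penalty data. The sketch below assumes per-period inputs laid out as nested lists over processes i and media j (an illustrative layout, not from the paper):

```python
def stage_cost(c, p_s, p_e, d, x):
    """Stage value of Eq. (5): energy cost plus shortage and excess penalties,
    summed over processes i and energy media j for one time period t."""
    total = 0.0
    for i in range(len(x)):
        for j in range(len(x[i])):
            total += c[j] * x[i][j]                           # consumption cost
            total += p_s[i][j] * max(0.0, d[i][j] - x[i][j])  # shortage penalty
            total += p_e[i][j] * max(0.0, x[i][j] - d[i][j])  # excess penalty
    return total
```

For one process and one medium with c = 1, p^s = 2, p^e = 0.5, demand 10 and allocation 8, the stage value is 8 + 2·2 = 12: only one of the two max terms can be active for a given (i, j).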

2.2. Mathematical Model for Energy Demand Prediction

The problem presented in subsection 2.1 comprises multi-stage decisions, each based on a real-time energy demand prediction. To obtain the value of dijt in each period t, an ADP algorithm is designed to realize on-line energy prediction. The ADP algorithm is self-learning and self-adaptive, which makes it suitable for time-varying complex systems and dynamic, complicated tasks, especially problems without an analytical model. These features make ADP a good fit for the addressed on-line energy allocation problem. The details are as follows.

Parameters:

α0 Initial value of learning parameter

Kα A large constant governing the decay of the learning parameter

P0 Initial value of exploration probability

KP A large constant governing the decay of the exploration probability

εE Threshold value of exploration stopping criterion

εL Threshold value of learning stopping criterion

γ Discount factor

k Prediction horizon

To solve a problem, ADP usually first analyzes the problem to identify state variables that express its features well, then designs a value function, actions and contributions for each state. Based on the analysis of the input and output variables, the system state of process i in time period t for energy demand prediction is set as $s_{it}=\{s_{1it},s_{2t}\}$, where $s_{1it}$ is the yield of process i in time period t and $s_{2t}$ is the air temperature in time period t. The action is denoted by $a_{it}$, which corresponds to the predicted value of $d_{ijt}$ in time period t at process i. The system state transition function is defined as:

$$s_{i,t+1}=S^{M}(s_{it},d_{ijt},W_{i,t+1})$$

where $W_{i,t+1}$ denotes all the information that arrives between periods t and t+1. Since accuracy is the most important performance index for the energy demand prediction problem, the contribution function $d_{t}(s_{it},d_{ijt})$ is defined as the absolute value of the difference between the actual value $d_{ijt}^{ACT}$ and the predicted value $d_{ijt}^{PRE}$ of the demand for energy j at process i in time period t, that is,

$$d_{t}(s_{it},d_{ijt})=\left|d_{ijt}^{ACT}-d_{ijt}^{PRE}\right|\qquad(6)$$

With the discount factor γ (0 < γ < 1), the objective is to minimize the expected sum of discounted prediction deviations:

$$\min_{d_{ijt}\in D_{ijt}}E\left\{\sum_{t=0}^{\infty}\gamma^{t}d_{t}(s_{it},d_{ijt})\right\}\qquad(7)$$

where $D_{ijt}$ is the action space and $P(s'|s_{it},d_{ijt})$ is the probability of transition to state s′. To solve (7), the Bellman equation is adopted to establish the following model:

$$V_{t}(s_{it})=\min_{d_{ijt}\in D_{ijt}}\left[d_{t}(s_{it},d_{ijt})+\gamma E\{V_{t+1}(s_{i,t+1})\,|\,s_{it},d_{ijt}\}\right]=\min_{d_{ijt}\in D_{ijt}}\left[d_{t}(s_{it},d_{ijt})+\gamma\sum_{s'\in S}P(s'|s_{it},d_{ijt})V_{t+1}(s')\right]\qquad(8)$$

For most problems, it is difficult to obtain knowledge of the state transition probability $P(s'|s_{it},d_{ijt})$; thus a Robbins-Monro stochastic approximation-based ADP algorithm is used to realize the prediction in (8), that is, to obtain the value of $d_{ijt}$ for each type of energy medium in each process. To simplify the derivation of a solvable expression of $V_{t}(s_{it})$ for each process i in time period t, in Eqs. (9), (10), (11), (12), (13), (14) we write $s_{t}$ for $s_{it}$ and $a_{t}$ for $a_{it}$ (the action determining the predicted value of $d_{ijt}$, the amount of energy j to be used at process i in time period t).

$$V_{t}(s_{t})=\min_{a_{t}\in A_{t}}\left(d_{t}(s_{t},a_{t})+\gamma E\{V_{t+1}(s_{t+1})\,|\,s_{t}\}\right)=\min_{a_{t}\in A_{t}}\left(d_{t}(s_{t},a_{t})+\gamma\sum_{s'\in S}P(s'|s_{t},a_{t})V_{t+1}(s')\right)\qquad(9)$$

At each decision point, it is difficult to solve the value function at state $s_{t}$ exactly because of the computational burden, so iterative estimation is used to approximate it. Let $\bar{V}_{t}(s_{t})$ be the approximation of the value function obtained by iterative estimation, $\bar{V}_{t}^{0}$ the initial approximate value, and $\bar{V}_{t}^{n-1}$ the approximate value after n−1 iterations. At the beginning of the iteration, the action $a_{0}$ is obtained by solving the following equation:

$$a_{0}=\arg\min_{a\in A_{0}}\left(d_{0}(s_{0},a)+\gamma E\{\bar{V}_{1}(s_{1})\,|\,s_{0}\}\right)=\arg\min_{a\in A_{0}}\left(d_{0}(s_{0},a)+\gamma\sum_{s'\in S}P_{0}(s'|s_{0},a)\bar{V}_{1}(s')\right)\qquad(10)$$

By the state transition function $s_{t+1}=S^{M}(s_{t},a_{t},W_{t+1})$, the next state $s_{1}$ is obtained from the initial state $s_{0}$, and the iterative decision process is then carried out. According to the Bellman equation with discount factor, the optimal value function $V_{t}^{*}(s_{t})$ at state $s_{t}$ can be calculated by:

$$V_{t}^{*}(s_{t})=\min_{a\in A_{t}}\left(d_{t}(s_{t},a)+\gamma\sum_{s'\in S}P(s'|s_{t},a)V_{t+1}^{*}(s')\right)\qquad(11)$$

According to Bellman theory, for each state-decision tuple (s, a), the optimal value function is obtained by solving the following equation:   

$$V^{*}(s,a)=\sum_{s'\in S}P(s'|s,a)\,d(s,a)+\gamma\left[\sum_{s''\in S}P(s''|s,a)\min_{a''\in A}V^{*}(s'',a'')\right],\quad\forall s,a\qquad(12)$$

Here, s′ and s″ are possible states obtained by implementing action a at state s, and $P(s'|s,a)$ and $P(s''|s,a)$ are the corresponding state transition probabilities; a″ is the action that minimizes the value function at state s″. Considering the difficulty of obtaining the state transition probabilities, the learning version of (12) is expressed as:

$$V_{t}^{n}(s,a)=(1-\alpha_{n-1})V_{t}^{n-1}(s,a)+\alpha_{n-1}\left[d_{t}(s,a)+\gamma\min_{a'\in A}V_{t+1}^{n-1}(s',a')\right],\quad\forall s,s'\in S,\ a\in A\qquad(13)$$

Where α (0<α<1) is the learning parameter, which is updated by:22)   

$$\alpha_{n-1}=\frac{\alpha_{0}}{1+u},\qquad u=\frac{(n-1)^{2}}{K+n-1}\qquad(14)$$

where K is a large constant and α0 = 0.7.13)
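The decay rule (14) is straightforward to compute; in the sketch below the default K is an arbitrary large constant chosen for illustration (the paper's own parameter values appear later in Table 3):

```python
def learning_rate(n: int, alpha0: float = 0.7, K: float = 1.0e6) -> float:
    """Eq. (14): alpha_{n-1} = alpha0 / (1 + u) with u = (n-1)^2 / (K + n - 1).
    The larger K is, the longer alpha stays near alpha0 before decaying."""
    u = (n - 1) ** 2 / (K + n - 1)
    return alpha0 / (1 + u)
```

At n = 1 the rate equals α0 exactly (u = 0), and it decreases monotonically thereafter, which is the usual stochastic-approximation requirement on the step size.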

Finally, based on the learning version of the value function approximation in (13) and the design of the state variables for the energy prediction problem, the learning version of the value function for the state-action pair $(s_{it},d_{ijt})$ is as follows:

$$V_{t}^{n}(s_{it},d_{ijt})=(1-\alpha_{n-1})V_{t}^{n-1}(s_{it},d_{ijt})+\alpha_{n-1}\left[d_{t}(s_{it},d_{ijt})+\gamma\min_{a\in D_{ijt}}V_{t+1}^{n-1}(s_{i,t+1},a)\right]\qquad(15)$$

where $s_{it},s_{i,t+1}\in S$ and $d_{ijt}\in D_{ijt}$. The learning parameter α decays according to (14).

The ADP-based learning procedures can be illustrated by Fig. 3.

Fig. 3.

The flow chart of ADP-based learning procedures.

3. Design of Prediction Algorithm Based on ADP

In this section, the detailed procedures of the ADP-based energy allocation algorithm are presented. To obtain the real-time energy demand, the system state, action, contribution function and other parameters of energy prediction are first designed. After initialization, the learning and testing stages of the proposed algorithm are elaborated. The purpose of initialization is to assign initial values to the value functions of all initial state-action pairs. At the learning stage, sample data are fed to the ADP one by one and iteratively until the value function stops improving. At the testing stage, a new input state to be predicted is given to the ADP to obtain the energy demand, based on which the energy allocation in the corresponding time period is carried out. The detailed steps of on-line energy allocation are as follows.

Step 1 Determine state, action and contribution function of energy prediction problem. Set prediction horizon and the length of the time window.

Step 2 Determine and discretize the state space and the action space; the value functions $V(s_{it},d_{ijt})$ corresponding to all state-action pairs $(s_{it},d_{ijt})$ are initialized to 0.

Step 3 Let n = 1 and begin the exploration stage. Obtain the exploration probability P; if P ≤ εE, go to Step 4. Otherwise generate a random number m from a uniform distribution; choose any action $d_{ijt}\in D_{ijt}$ at random when m < P, or choose the greedy action $d_{ijt}\in D_{ijt}$ (the action with the minimal V value) when m ≥ P. When a state $s_{it}$ is visited for the first time, choose an action $d_{ijt}\in D_{ijt}$ randomly, since $V(s_{it},d_{ijt})$ is initialized to 0. The exploration probability P is updated by:

$$P_{n-1}=\frac{P_{0}}{1+u_{P}},\qquad u_{P}=\frac{(n-1)^{2}}{K_{P}+n-1}$$
Calculate the contribution $d_{t}(s_{it},d_{ijt})$, get the next state $s_{i,t+1}$, update $V(s_{it},d_{ijt})$ according to (15), set n = n+1 and go to Step 5.

Step 4 Start learning: let n = 1, choose an action $d_{ijt}$ from $D_{ijt}$ using the greedy criterion, calculate the contribution $d_{t}(s_{it},d_{ijt})$, get the next state $s_{i,t+1}$, update $V(s_{it},d_{ijt})$ by (13), set n = n+1 and go to the next step.

Step 5 Continue learning with the training matrix. Learning ends when $V(s_{it},d_{ijt})$ converges ($|V^{n}-V^{n-1}|\le\varepsilon_{L}$) for every pair $(s_{it},d_{ijt})$; then go to the next step.

Step 6 After learning, the obtained $V(s_{it},d_{ijt})$ table contains the optimal action for each state; each unknown variable $d_{ijt}$ is then obtained by looking up the $V(s_{it},d_{ijt})$ table according to the corresponding system state $s_{it}$.
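Steps 1-6 can be condensed into a small tabular learner. The sketch below is a toy instance under several assumptions not in the paper: two discretized states, a four-value action grid, a cyclic next-state rule taken from the sample order, and one shared decay constant for both the learning rate of Eq. (14) and the exploration probability:

```python
import random

def train_predictor(samples, actions, episodes=200, gamma=0.9,
                    alpha0=0.7, K=500.0, p0=1.0, seed=0):
    """Tabular sketch of Steps 1-6: a state is a discretized (yield, temperature)
    pair, an action is a candidate demand value, and the contribution of Eq. (6)
    is the absolute prediction error.  V-table entries start at 0 (Step 2)."""
    rng = random.Random(seed)
    V = {(s, a): 0.0 for (s, _) in samples for a in actions}
    n = 1
    for _ in range(episodes):
        for t, (s, actual) in enumerate(samples):
            decay = 1 + (n - 1) ** 2 / (K + n - 1)
            alpha, P = alpha0 / decay, p0 / decay        # Eq. (14) and Step 3
            if rng.random() < P:                         # explore: random action
                a = rng.choice(actions)
            else:                                        # exploit: greedy action
                a = min(actions, key=lambda b: V[(s, b)])
            contrib = abs(actual - a)                    # Eq. (6)
            s_next = samples[(t + 1) % len(samples)][0]  # toy next-state rule
            future = min(V[(s_next, b)] for b in actions)
            V[(s, a)] = (1 - alpha) * V[(s, a)] + alpha * (contrib + gamma * future)
            n += 1
    # Step 6: the greedy action per state is the predicted demand
    return {s: min(actions, key=lambda a: V[(s, a)]) for (s, _) in samples}
```

On this toy set the greedy lookup of Step 6 recovers, for each state, the action closest to that state's actual consumption, since only that action accumulates zero contribution.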

The overall scheme of the dynamic demand-update-based ADP algorithm for on-line energy allocation can be summarized as follows. Dynamic programming (DP) is adopted at the outermost layer, so the planning horizon is divided into multiple stages, each corresponding to a subproblem that is formulated as a linear programming (LP) model based on the prediction result of ADP; the subproblem can be solved optimally by CPLEX. ADP is used at each stage to accomplish accurate estimation of the demand through iterative learning, because it is difficult to obtain the accurate energy demand for the updated production environment without knowledge of the energy consumption principle.
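Each stage subproblem becomes an LP once the two max terms of Eq. (5) are linearised with shortage and excess variables u and v satisfying x + u − v = d. The sketch below collapses the process index for brevity and uses SciPy's `linprog` purely as an illustrative stand-in for the CPLEX solver named above:

```python
import numpy as np
from scipy.optimize import linprog

def stage_lp(c, p_s, p_e, d, supply):
    """Stage subproblem for one process: minimise energy cost plus shortage and
    excess penalties.  Variables per medium j: x_j (allocation), u_j (shortage)
    and v_j (excess), linked by x_j + u_j - v_j = d_j.  `supply` is assumed to
    already include secondary energy carried over from period t-1, as in Eq. (2)."""
    J = len(d)
    cost = np.concatenate([c, p_s, p_e])                  # objective of Eq. (5)
    A_eq = np.hstack([np.eye(J), np.eye(J), -np.eye(J)])  # x + u - v = d
    A_ub = np.hstack([np.eye(J), np.zeros((J, 2 * J))])   # x <= supply
    res = linprog(cost, A_ub=A_ub, b_ub=supply, A_eq=A_eq, b_eq=d,
                  bounds=[(0, None)] * (3 * J))
    return res.x[:J], res.fun
```

With c = [1], p^s = [5], p^e = [2], d = [10] and supply 8, the LP allocates the full supply of 8 and pays the shortage penalty on the remaining 2; with positive penalties, u and v are never both positive at the optimum, so the linearisation is exact.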

4. Experimental Results Based on Real-world Data

In order to demonstrate the performance of the proposed ADP algorithm for the on-line energy allocation problem, real-world energy data from a Chinese iron & steel enterprise are used. Six production processes, namely sintering, coking, ironmaking, steelmaking, hot rolling and cold rolling, are considered. The proposed algorithms have been implemented in C++ on a PC with an Intel(R) Core(TM) 2 Duo CPU (2.33 GHz) running the Windows XP operating system. The distribution of energy consumption and regeneration among the production processes considered in this paper is shown in Table 1, in which the abbreviations are: BFG - Blast Furnace Gas, COG - Coke Oven Gas, LDG - Linz-Donawitz Gas, LO2 - Low-pressure oxygen, HO2 - High-pressure oxygen, N2 - Nitrogen, Ar - Argon, MSteam - Medium-pressure steam, LSteam - Low-pressure steam.

Table 1. Distribution of energy consumption and regeneration among all considered production processes.
Process | BFG | COG | LDG | LO2 | HO2 | N2 | Ar | MSteam | LSteam | Electricity
Sintering+
Coking√ +√ +
Ironmaking√ +√ +
Steelmaking√ +√ +
Hot rolling+
Cold rolling

'√' denotes consumption of the energy by the corresponding process; '+' denotes regeneration of the energy by the process.

4.1. Performance of ADP for Prediction

As the core part of on-line energy allocation approach, the accuracy of energy demand prediction will directly affect the final performance. The real-world energy data of steelmaking process (shown in Table 2) is used to show the prediction results.

Table 2. Real-world data of energy consumption in steelmaking process.
t | S1it (ton) | S2t (°C) | LDG (m3) | HO2 (m3) | N2 (m3) | Ar (m3) | LSteam (kg) | Electricity (kWh)
1 | 25.21 | 1.5 | 10.4 | 59.4 | 33.7 | 1.50 | 25.2 | 68.8
2 | 23.46 | 2.3 | 10.9 | 60.7 | 36.1 | 1.60 | 23.9 | 64.6
3 | 26.60 | 11.6 | 10.5 | 59.5 | 31.8 | 1.40 | 28.5 | 61.9
4 | 24.37 | 15.6 | 10.5 | 59.0 | 33.9 | 1.40 | 30.2 | 64.4
5 | 22.34 | 22.2 | 11.3 | 59.5 | 27.6 | 1.40 | 30.6 | 70.2
6 | 23.88 | 23.5 | 10.1 | 59.8 | 27.9 | 1.30 | 32.5 | 68.0
7 | 24.65 | 29.0 | 9.7 | 60.1 | 31.3 | 1.50 | 27.9 | 73.5
8 | 24.36 | 27.0 | 9.8 | 59.6 | 31.5 | 1.50 | 29.5 | 68.9
9 | 22.18 | 24.4 | 10.2 | 60.4 | 34.1 | 1.60 | 32.0 | 79.7
10 | 17.42 | 19.0 | 14.3 | 63.3 | 38.1 | 1.70 | 45.5 | 90.4
11 | 18.13 | 10.5 | 14.7 | 60.6 | 46.6 | 1.90 | 50.6 | 103.1
12 | 17.41 | 5.8 | 16.0 | 62.8 | 45.2 | 1.70 | 43.2 | 108.4
13 | 18.08 | 2.2 | 15.0 | 60.0 | 45.4 | 1.70 | 36.0 | 107.0
14 | 20.90 | 7.3 | 12.1 | 60.0 | 40.4 | 1.60 | 44.0 | 82.8
15 | 23.42 | 9.8 | 9.8 | 60.3 | 38.9 | 1.56 | 37.0 | 76.5
16 | 14.67 | 16.6 | 11.4 | 60.5 | 35.1 | 2.00 | 9.0 | 90.9
17 | 21.67 | 21.9 | 12.2 | 60.5 | 34.8 | 1.73 | 16.0 | 81.6
18 | 25.95 | 26.6 | 9.9 | 58.7 | 34.0 | 1.62 | 9.0 | 73.6
19 | 29.72 | 28.0 | 10.0 | 58.6 | 33.8 | 1.84 | 7.0 | 73.4
20 | 30.56 | 27.2 | 9.9 | 58.4 | 33.4 | 1.64 | 8.0 | 72.0
21 | 29.24 | 23.4 | 9.4 | 59.1 | 33.5 | 1.60 | 12.0 | 74.0
22 | 29.78 | 20.1 | 10.5 | 58.0 | 34.7 | 1.56 | 14.0 | 73.0
23 | 24.16 | 8.6 | 12.5 | 58.9 | 36.0 | 1.65 | 15.0 | 78.8
24 | 32.30 | 4.5 | 10.5 | 59.8 | 38.8 | 1.60 | 11.3 | 74.0
25 | 33.16 | 3.6 | 9.0 | 57.0 | 37.0 | 1.60 | 11.0 | 74.5
26 | 28.52 | 6.0 | 10.2 | 56.56 | 39.79 | 1.57 | 7.4 | 73.2
27 | 30.94 | 8.6 | 10.9 | 56.3 | 39.9 | 1.49 | 7.0 | 76.3
28 | 26.90 | 12.8 | 12.1 | 51.9 | 39.6 | 1.50 | 7.0 | 82.9
29 | 33.34 | 21.1 | 10.0 | 55.9 | 34.8 | 1.57 | 9.0 | 72.1
30 | 28.90 | 24.8 | 9.8 | 58.3 | 32.5 | 1.63 | 9.5 | 69.6
31 | 32.74 | 28.3 | 10.0 | 56.1 | 34.0 | 1.57 | 9.2 | 71.4
32 | 32.10 | 29.7 | 9.6 | 54.9 | 32.3 | 1.63 | 9.4 | 69.8
33 | 32.19 | 24.2 | 9.9 | 55.6 | 32.7 | 1.55 | 9.3 | 68.9

In theory, dijt has an infinite number of possible values in the action space, which is unmanageable for the real-world problem addressed; therefore this paper discretizes the action space into a finite number of values. The other algorithm parameters are listed in Table 3. As the core part of the proposed method, the accuracy of the energy demand prediction directly affects the overall performance; therefore the results of ADP for on-line and static energy demand prediction are summarized in Table 4. Here the length of the prediction horizon is set to 5 time periods; the on-line prediction uses the up-to-date data of each time period, while the static prediction is based only on data prior to the prediction horizon. It can be seen from the data in Table 4 that the on-line prediction deviations stay at a relatively lower level than those of the static method, with average values of 3.7512% and 4.1768%, respectively. Occasionally the prediction deviation is larger than 5%; on one hand, this may be because more impact factors should be extracted from practical production and designed as components of the state vector; on the other hand, as a data-based algorithm, the performance of ADP is affected by the size and quality of the sample data.

Table 3. Parameter values for ADP algorithm.
Parameter | α0 | Kα | P0 | KP | εE | εL | γk
Value | 0.7 | 8 | 10^13 | 1500 | 0.00001 | 0.05 | 0.95
Table 4. On-line and static prediction results for steelmaking process.

4.2. Experimental Results of On-line Energy Allocation Based on ADP

The data used for the energy allocation experiments comprise three parts: real-world data, data calculated from the real data, and random data generated according to the consumption rules. The planning horizon is T=15 (hours), the number of processes is I=6, and the number of energy media is J=10. The penalties p^s_ijt and p^e_ijt are generated from the overall consideration of the unit energy cost c_jt and the priority of meeting demand. Using these data, on-line energy allocation is carried out for the 6 processes. The production plans of these processes along the planning horizon are given in Table 5, where the data contain reasonable variations relative to the stable real-world data in Table 2 in order to test the performance of the proposed method. In Table 5, the data under S2t represent the air temperature and take the same value for all processes. To verify the performance of the demand update-based dynamic programming method for the energy allocation problem, a static programming method is designed for comparison, which determines the energy allocation of each time period without considering the impact between adjacent processes. Here, static means that the energy demand is predicted once at the beginning of planning and remains unchanged over the horizon, whereas the demand in the on-line framework is updated in each time period.
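The contrast between the two frameworks can be sketched in miniature: a toy rolling-horizon loop for a single process and a single energy medium, in which the on-line mode refreshes its demand forecast from the latest observation each period while the static mode keeps the initial forecast for the whole horizon. The naive last-value forecast and the single supply bound are deliberate simplifications of the paper's ADP-based framework:

```python
def rolling_allocation(true_demand, initial_forecast, supply, online=True):
    """Toy rolling-horizon allocation of one energy medium to one process.
    Each period we allocate what the current forecast says is needed (capped
    by supply) and tally shortage and waste against the realized demand."""
    forecast = initial_forecast
    shortage = waste = 0.0
    for d in true_demand:
        alloc = min(forecast, supply)      # allocate the forecast demand
        shortage += max(d - alloc, 0.0)    # unmet demand this period
        waste += max(alloc - d, 0.0)       # over-allocated energy this period
        if online:
            forecast = d                   # naive update: last observed demand
    return shortage, waste
```

With a drifting demand profile the on-line mode accumulates less shortage and waste, because its forecast lags the truth by only one period instead of the whole horizon.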

Table 5. The production plans of all considered processes.
(unit: S1it – ton, S2t – °C; S2t is identical for all processes and is shown once)
Time period | Sintering S1it | Coking S1it | Ironmaking S1it | Steelmaking S1it | Hot rolling S1it | Cold rolling S1it | S2t
1 | 41.53 | 7.09 | 23.20 | 20.64 | 26.73 | 9.70 | 15.6
2 | 48.68 | 6.18 | 26.17 | 25.16 | 27.55 | 5.17 | 22.2
3 | 41.37 | 7.61 | 16.07 | 31.25 | 15.70 | 7.23 | 23.5
4 | 29.30 | 8.75 | 18.06 | 28.19 | 28.31 | 6.77 | 29.0
5 | 28.03 | 9.61 | 15.43 | 19.24 | 19.31 | 7.72 | 27.0
6 | 37.34 | 12.51 | 19.99 | 30.45 | 30.32 | 8.01 | 24.4
7 | 39.65 | 9.38 | 20.19 | 19.54 | 31.21 | 8.89 | 19.0
8 | 40.51 | 9.91 | 23.87 | 16.35 | 30.64 | 7.98 | 10.5
9 | 41.83 | 12.46 | 25.51 | 23.58 | 13.54 | 6.90 | 5.8
10 | 42.33 | 10.93 | 19.78 | 21.13 | 16.26 | 6.93 | 2.2
11 | 36.61 | 10.00 | 23.58 | 21.86 | 20.44 | 5.27 | 7.3
12 | 54.63 | 11.86 | 16.42 | 33.46 | 28.38 | 7.93 | 9.8
13 | 28.33 | 8.80 | 15.77 | 30.84 | 17.87 | 9.32 | 16.6
14 | 28.27 | 13.24 | 28.40 | 14.43 | 15.19 | 8.19 | 21.9
15 | 37.95 | 8.09 | 17.79 | 26.39 | 31.45 | 6.25 | 26.6

Tables 6, 7, 8, 9, 10, 11 present the energy allocation plans of all processes produced by the proposed demand update-based on-line method and by the static method. The differences between the on-line and static frameworks (marked in bold in the original tables) are caused by the different methods of energy demand prediction. Further, a different quantity of energy allocated in a prior time period results in a different quantity of regeneration, which in turn affects the amount of energy available in subsequent time periods. Recall that energy consumption cost, energy shortage penalty and energy waste penalty are all considered in the objective function; with a limited available quantity of energy, the goal of energy allocation is therefore to seek a comprehensive plan that balances these objectives. This is why the quantity of energy allocated to each process is not simply determined by the production yield, and why in the obtained results the allocated energy is not proportional to the yield.
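The balance described above, consumption cost versus shortage and waste penalties, can be written as a small linear program. The sketch below allocates one energy medium to several processes in a single period, using scipy's linprog as a stand-in for CPLEX; the variable layout and penalty structure are simplified assumptions rather than the paper's full multi-period, multi-medium model:

```python
import numpy as np
from scipy.optimize import linprog

def allocate_period(demand, supply, cost, p_short, p_waste):
    """Single-period LP: choose allocations x_i, shortages s_i and waste w to
    minimize  cost*sum(x) + p_short*sum(s) + p_waste*w
    subject to x_i + s_i = demand_i  (shortage covers unmet demand)
    and        sum(x) + w = supply   (unallocated energy counts as waste)."""
    n = len(demand)
    # variable vector: [x_1..x_n, s_1..s_n, w]
    c = np.concatenate([np.full(n, cost), np.full(n, p_short), [p_waste]])
    A_eq = np.zeros((n + 1, 2 * n + 1))
    b_eq = np.zeros(n + 1)
    for i in range(n):
        A_eq[i, i] = 1.0          # x_i
        A_eq[i, n + i] = 1.0      # + s_i
        b_eq[i] = demand[i]       # = demand_i
    A_eq[n, :n] = 1.0             # sum of allocations
    A_eq[n, 2 * n] = 1.0          # + waste
    b_eq[n] = supply              # = available supply
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (2 * n + 1))
    return res.x[:n], res.x[n:2 * n], res.x[2 * n]
```

When supply exceeds total demand the solver leaves the surplus as waste rather than over-allocating; when supply is scarce it fills demand as far as possible and books the remainder as shortage, mirroring the trade-off in the paper's objective.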

Table 6. Comparison between on-line and static energy allocation for sintering process.
Time period | On-line: COG (m3) | LSteam (kg) | Electricity (kwh) | Static: COG (m3) | LSteam (kg) | Electricity (kwh)
1 | 4.00 | 14.20 | 35.70 | 4.00 | 14.20 | 35.70
2 | 3.90 | 14.80 | 51.50 | 3.90 | 14.80 | 51.50
3 | 3.88 | 14.80 | 37.00 | 3.88 | 14.80 | 37.00
4 | 3.72 | 10.60 | 42.00 | 3.72 | 10.60 | 42.00
5 | 3.72 | 10.60 | 42.00 | 3.72 | 10.60 | 42.00
6 | 3.74 | 13.20 | 38.80 | 3.74 | 13.20 | 38.80
7 | 3.80 | 12.20 | 38.25 | 3.80 | 12.20 | 38.25
8 | 3.75 | 8.20 | 44.00 | 3.75 | 8.20 | 44.00
9 | 3.69 | 9.55 | 41.50 | 3.69 | 9.55 | 41.50
10 | 3.69 | 9.55 | 41.50 | 3.69 | 9.55 | 41.50
11 | 3.71 | 10.70 | 43.50 | 3.71 | 10.70 | 43.50
12 | 3.96 | 10.75 | 53.50 | 3.96 | 10.75 | 53.50
13 | 3.72 | 10.60 | 42.00 | 3.72 | 10.60 | 42.00
14 | 3.72 | 10.60 | 42.00 | 3.72 | 10.60 | 42.00
15 | 3.74 | 13.20 | 38.80 | 3.74 | 9.40 | 38.80
Table 7. Comparison between on-line and static energy allocation for coking process.
Time period | On-line: BFG (m3) | COG (m3) | N2 (m3) | MSteam (kg) | LSteam (kg) | Electricity (kwh) | Static: BFG (m3) | COG (m3) | N2 (m3) | MSteam (kg) | LSteam (kg) | Electricity (kwh)
1 | 509.61 | 88.37 | 17.59 | 13.15 | 3.15 | 26.50 | 510.11 | 87.87 | 17.59 | 13.15 | 3.15 | 26.50
2 | 702.29 | 130.19 | 12.71 | 15.70 | 2.35 | 29.50 | 496.70 | 112.38 | 12.71 | 15.03 | 2.35 | 29.50
3 | 743.00 | 193.50 | 11.02 | 15.82 | 4.91 | 29.50 | 542.08 | 193.50 | 11.02 | 16.30 | 4.91 | 29.50
4 | 590.50 | 138.50 | 6.52 | 16.60 | 3.46 | 28.50 | 590.50 | 138.50 | 6.52 | 13.93 | 3.46 | 28.50
5 | 585.00 | 123.50 | 4.37 | 16.00 | 3.62 | 28.50 | 585.00 | 123.50 | 4.37 | 14.41 | 3.62 | 28.50
6 | 474.00 | 200.50 | 6.49 | 15.62 | 2.80 | 26.00 | 570.50 | 122.50 | 2.64 | 14.32 | 3.59 | 27.00
7 | 624.50 | 135.00 | 4.25 | 15.06 | 2.62 | 28.50 | 624.50 | 82.56 | 4.25 | 10.72 | 2.62 | 28.50
8 | 623.67 | 126.00 | 5.00 | 14.54 | 3.74 | 27.50 | 407.25 | 143.00 | 5.89 | 10.62 | 4.72 | 27.00
9 | 603.19 | 144.00 | 2.04 | 13.86 | 4.90 | 22.50 | 566.39 | 156.50 | 2.84 | 13.40 | 6.01 | 22.00
10 | 651.00 | 114.50 | 3.14 | 14.04 | 5.25 | 25.00 | 681.00 | 144.50 | 3.82 | 14.60 | 6.38 | 23.00
11 | 617.00 | 98.50 | 3.39 | 15.12 | 3.38 | 28.00 | 513.00 | 93.50 | 2.59 | 13.70 | 3.93 | 29.50
12 | 616.50 | 151.00 | 3.19 | 14.90 | 3.18 | 26.00 | 581.67 | 156.50 | 2.84 | 13.07 | 6.01 | 22.00
13 | 662.00 | 140.50 | 6.59 | 15.00 | 2.33 | 29.00 | 412.39 | 140.50 | 6.59 | 13.09 | 2.33 | 29.00
14 | 474.00 | 200.50 | 6.49 | 15.62 | 2.80 | 26.00 | 570.50 | 122.50 | 2.64 | 11.95 | 3.59 | 27.00
15 | 482.50 | 163.00 | 3.30 | 16.42 | 4.10 | 29.00 | 482.50 | 163.00 | 3.30 | 15.80 | 4.10 | 29.00
Table 8. Comparison between on-line and static energy allocation for ironmaking process.
Time period | On-line: BFG (m3) | COG (m3) | LO2 (m3) | HO2 (m3) | N2 (m3) | LSteam (kg) | Electricity (kwh) | Static: BFG (m3) | COG (m3) | LO2 (m3) | HO2 (m3) | N2 (m3) | LSteam (kg) | Electricity (kwh)
1 | 701.00 | 7.82 | 20.87 | 0.67 | 16.38 | 7.54 | 54.00 | 701.00 | 7.82 | 20.68 | 0.67 | 16.38 | 7.54 | 54.00
2 | 664.00 | 6.72 | 21.72 | 0.58 | 23.10 | 12.16 | 56.50 | 664.00 | 6.72 | 21.62 | 0.58 | 23.10 | 12.16 | 56.50
3 | 709.80 | 7.46 | 25.08 | 0.90 | 22.34 | 9.00 | 56.50 | 715.50 | 7.46 | 24.96 | 0.90 | 22.34 | 9.00 | 56.50
4 | 705.50 | 7.11 | 18.90 | 0.80 | 22.80 | 8.51 | 57.50 | 705.50 | 7.11 | 18.78 | 0.80 | 22.80 | 8.51 | 57.50
5 | 722.00 | 7.32 | 23.21 | 0.84 | 22.60 | 7.90 | 56.00 | 681.74 | 7.32 | 23.11 | 0.84 | 22.60 | 7.90 | 56.00
6 | 705.50 | 7.11 | 20.68 | 0.80 | 22.80 | 8.51 | 57.50 | 605.39 | 7.11 | 20.58 | 0.80 | 22.80 | 8.51 | 57.50
7 | 705.50 | 7.11 | 22.79 | 0.80 | 22.80 | 8.51 | 57.50 | 597.84 | 7.11 | 22.69 | 0.80 | 22.80 | 8.51 | 57.50
8 | 674.50 | 7.23 | 24.70 | 0.61 | 19.30 | 8.85 | 53.00 | 674.50 | 7.23 | 24.70 | 0.61 | 18.94 | 8.85 | 53.00
9 | 674.50 | 6.73 | 22.70 | 0.60 | 18.50 | 7.97 | 53.50 | 513.00 | 2.44 | 28.50 | 0.45 | 27.50 | 8.76 | 56.00
10 | 711.50 | 7.60 | 20.52 | 0.79 | 21.00 | 8.10 | 53.50 | 711.50 | 7.60 | 20.17 | 0.79 | 21.00 | 8.10 | 53.50
11 | 585.07 | 6.22 | 23.04 | 0.61 | 18.60 | 8.92 | 53.50 | 479.89 | 6.22 | 22.69 | 0.61 | 19.30 | 8.92 | 53.50
12 | 715.50 | 7.46 | 27.30 | 0.90 | 23.60 | 9.00 | 56.50 | 715.50 | 7.46 | 26.95 | 0.90 | 23.60 | 9.00 | 56.50
13 | 669.25 | 8.13 | 27.62 | 0.70 | 23.80 | 14.00 | 55.00 | 708.50 | 8.13 | 27.52 | 0.70 | 23.80 | 14.00 | 55.00
14 | 626.50 | 5.63 | 25.00 | 0.50 | 18.60 | 10.90 | 55.00 | 541.01 | 5.63 | 25.00 | 0.50 | 22.45 | 10.90 | 55.00
15 | 705.50 | 7.11 | 25.60 | 0.80 | 21.46 | 8.51 | 57.50 | 662.20 | 7.11 | 25.60 | 0.80 | 21.46 | 8.51 | 57.50
Table 9. Comparison between on-line and static energy allocation for steelmaking process.
Time period | On-line: LDG (m3) | HO2 (m3) | N2 (m3) | Ar (m3) | LSteam (kg) | Electricity (kwh) | Static: LDG (m3) | HO2 (m3) | N2 (m3) | Ar (m3) | LSteam (kg) | Electricity (kwh)
1 | 11.10 | 60.00 | 35.00 | 1.40 | 33.92 | 78.00 | 11.10 | 60.00 | 35.00 | 1.40 | 33.92 | 78.00
2 | 10.00 | 52.74 | 31.50 | 1.46 | 27.82 | 69.50 | 10.00 | 52.74 | 31.50 | 1.46 | 27.82 | 69.50
3 | 9.90 | 52.00 | 37.00 | 1.55 | 59.00 | 77.00 | 9.90 | 58.50 | 34.00 | 1.55 | 20.00 | 73.00
4 | 9.80 | 58.00 | 33.50 | 1.59 | 21.38 | 74.50 | 9.80 | 59.00 | 33.00 | 1.59 | 20.78 | 73.50
5 | 11.60 | 51.11 | 33.00 | 1.45 | 32.72 | 78.00 | 11.60 | 51.11 | 33.00 | 1.45 | 32.72 | 78.00
6 | 9.90 | 53.87 | 34.00 | 1.66 | 20.40 | 74.00 | 9.90 | 53.87 | 34.00 | 1.65 | 20.00 | 73.00
7 | 12.50 | 57.69 | 28.37 | 1.56 | 36.94 | 85.00 | 12.50 | 57.69 | 28.98 | 1.56 | 36.94 | 85.00
8 | 14.30 | 52.07 | 42.00 | 1.36 | 38.86 | 100.00 | 14.30 | 52.07 | 42.00 | 1.36 | 38.86 | 100.00
9 | 11.10 | 60.00 | 37.00 | 1.58 | 31.02 | 74.50 | 10.50 | 58.50 | 34.50 | 1.66 | 21.20 | 74.00
10 | 12.10 | 58.05 | 39.50 | 1.62 | 33.18 | 82.00 | 12.10 | 58.05 | 39.50 | 1.62 | 33.18 | 82.00
11 | 11.10 | 60.00 | 37.00 | 1.40 | 31.02 | 74.50 | 11.10 | 60.00 | 37.00 | 1.40 | 31.02 | 74.50
12 | 10.00 | 56.50 | 38.00 | 1.59 | 19.54 | 74.50 | 10.10 | 56.93 | 33.50 | 1.61 | 21.90 | 71.00
13 | 10.00 | 54.15 | 36.00 | 1.60 | 20.20 | 75.50 | 10.10 | 54.15 | 33.50 | 1.60 | 21.90 | 71.00
14 | 14.30 | 61.50 | 42.00 | 1.33 | 38.86 | 100.00 | 14.30 | 61.50 | 20.35 | 1.33 | 38.86 | 100.00
15 | 9.90 | 58.50 | 34.00 | 1.60 | 19.00 | 73.50 | 9.90 | 59.22 | 34.00 | 1.60 | 19.00 | 71.50
Table 10. Comparison between on-line and static energy allocation for hot rolling process.
Time period | On-line: BFG (m3) | COG (m3) | LDG (m3) | Electricity (kwh) | Static: BFG (m3) | COG (m3) | LDG (m3) | Electricity (kwh)
1 | 67.50 | 51.50 | 33.41 | 98.00 | 67.00 | 52.50 | 33.41 | 99.50
2 | 70.50 | 49.50 | 26.49 | 98.50 | 70.00 | 51.50 | 21.05 | 98.50
3 | 64.00 | 58.50 | 36.30 | 116.00 | 64.00 | 36.76 | 36.30 | 116.00
4 | 68.50 | 49.50 | 31.08 | 95.50 | 67.50 | 51.50 | 26.23 | 104.00
5 | 61.50 | 56.00 | 28.80 | 112.00 | 65.50 | 56.50 | 24.00 | 113.50
6 | 68.50 | 49.50 | 34.70 | 95.50 | 70.00 | 51.50 | 29.28 | 98.50
7 | 71.50 | 50.00 | 25.08 | 97.00 | 70.00 | 51.50 | 20.23 | 98.50
8 | 60.00 | 49.50 | 23.30 | 95.50 | 69.00 | 48.50 | 17.91 | 95.00
9 | 66.00 | 63.00 | 30.29 | 119.00 | 66.00 | 63.00 | 23.88 | 119.00
10 | 66.00 | 63.00 | 30.40 | 119.00 | 39.73 | 21.49 | 29.75 | 119.00
11 | 57.50 | 58.00 | 31.10 | 111.00 | 57.50 | 58.00 | 31.10 | 111.00
12 | 61.50 | 51.00 | 26.45 | 96.50 | 69.00 | 15.02 | 20.91 | 97.00
13 | 57.00 | 65.00 | 25.70 | 114.50 | 57.00 | 65.00 | 25.70 | 114.50
14 | 64.00 | 58.50 | 36.30 | 116.00 | 64.00 | 58.50 | 31.96 | 116.00
15 | 72.50 | 49.50 | 32.70 | 96.50 | 70.00 | 51.50 | 28.50 | 98.50
Table 11. Comparison between on-line and static energy allocation for cold rolling process.
Time period | On-line: COG (m3) | LO2 (m3) | N2 (m3) | LSteam (kg) | Electricity (kwh) | Static: COG (m3) | LO2 (m3) | N2 (m3) | LSteam (kg) | Electricity (kwh)
1 | 49.50 | 0.10 | 29.80 | 149.69 | 224.12 | 49.50 | 0.10 | 29.80 | 147.23 | 227.03
2 | 49.00 | 0.10 | 20.23 | 129.17 | 244.78 | 49.00 | 0.12 | 23.23 | 158.00 | 247.64
3 | 48.00 | 0.10 | 18.13 | 155.00 | 269.67 | 43.93 | 0.12 | 18.63 | 158.00 | 261.03
4 | 49.00 | 0.10 | 26.15 | 158.00 | 237.54 | 28.69 | 0.10 | 26.15 | 158.00 | 234.88
5 | 49.50 | 0.10 | 23.51 | 138.78 | 267.59 | 49.00 | 0.10 | 23.51 | 132.05 | 263.46
6 | 50.00 | 0.10 | 31.13 | 138.88 | 250.63 | 50.00 | 0.10 | 30.52 | 133.61 | 247.97
7 | 51.50 | 0.30 | 14.03 | 203.76 | 226.97 | 15.78 | 0.35 | 14.03 | 191.38 | 221.81
8 | 54.50 | 0.10 | 27.91 | 245.50 | 175.39 | 51.00 | 0.35 | 28.38 | 234.50 | 165.32
9 | 51.00 | 0.35 | 18.84 | 196.91 | 234.60 | 51.00 | 0.35 | 18.84 | 189.30 | 235.56
10 | 50.47 | 0.30 | 25.62 | 203.92 | 260.00 | 34.71 | 0.35 | 25.62 | 196.59 | 257.50
11 | 38.15 | 0.25 | 25.21 | 215.16 | 203.87 | 51.00 | 0.35 | 28.38 | 203.47 | 218.29
12 | 51.00 | 0.10 | 24.16 | 180.50 | 234.76 | 29.65 | 0.10 | 29.42 | 169.00 | 238.12
13 | 49.00 | 0.10 | 32.41 | 129.26 | 270.50 | 49.00 | 0.10 | 31.32 | 124.21 | 269.50
14 | 49.00 | 0.02 | 17.18 | 153.15 | 239.56 | 23.95 | 0.02 | 17.18 | 146.20 | 238.45
15 | 51.00 | 0.10 | 23.52 | 151.62 | 244.40 | 50.50 | 0.19 | 23.52 | 152.22 | 242.90

To visualize the merit of the on-line allocation framework, the accumulated objectives of the on-line and static frameworks along the horizon are drawn in Fig. 4. It can be observed that the two frameworks have almost equal cost in the first period, and that the on-line framework exhibits a clear advantage over the static framework in almost all subsequent periods. With respect to the accumulated cost along the horizon, the on-line allocation approach outperforms the static one for every process and for the overall plant. The prediction accuracy of ADP has been demonstrated by the experiments with real-world data in Table 4; therefore, as time progresses, the superiority of the on-line framework becomes increasingly obvious, because the on-line framework relies on on-line prediction of the energy demand while the static framework relies on a constant demand. We can conclude that, by using ADP to learn interactively from the time-varying environment and provide accurate energy demand, the energy allocation plan can track the dynamic change of the production environment and complete the production task at a lower energy cost. The proposed demand update-based dynamic programming method meets the requirements of on-line energy allocation in both effectiveness and running time.

Fig. 4. Objective comparison between on-line and static frameworks of all considered processes.

5. Conclusions

This paper studied the energy allocation problem with demand updates. An ADP-based solution method is proposed to deal with its random and time-varying features. The overall methodology is designed as a dynamic program in which each stage determines the optimal allocation plan based on the updated energy demand. An ADP algorithm is designed to accomplish real-time energy demand prediction, based on which the decision problem at each stage is formulated as a linear programming model that is solved to optimality by CPLEX. Compared with the static scheme, the proposed on-line energy allocation method shows clear superiority in effectiveness and stability.

Acknowledgement

This research is partly supported by the National Natural Science Foundation of China (Grant Nos. 71302161 and 61374203), the National 863 High-Tech Research and Development Program of China (Grant No. 2013AA040704), the Fund for Innovative Research Groups of the National Natural Science Foundation of China (No. 71321001), and the State Key Laboratory of Synthetical Automation for Process Industries Fundamental Research Funds (Grant No. 2013ZCX02).

References
 
© 2016 by The Iron and Steel Institute of Japan