Kurtosis-Based State of Health Prediction of Lithium-Ion Batteries Using Probability Density Function

Yinsen YU; Yongxiang CAI; Wei LIU; Zhenlan DOU; Bin YAO; Bide ZHANG; Qiangqiang LIAO; Zaiguo FU; Zhiyuan CHENG

doi:10.5796/electrochemistry.24-00037

Abstract

Lithium-ion batteries are widely used as power sources for various devices, so rapid and accurate estimation of their state of health (SOH) is an important means of reducing battery failures. This article reports charging and discharging experiments on NCA batteries and LFP battery modules and proposes a probability density function (PDF) based method for predicting the SOH of lithium-ion batteries. The kurtosis at the peaks of the PDF curve of the battery charging voltage is used as model input to achieve accurate SOH prediction. The experimental results show a good correlation between this health indicator and battery SOH, with absolute Pearson correlation coefficients greater than 0.96. The indicator therefore indirectly reflects the battery SOH and can serve as model input for SOH prediction. Long short-term memory (LSTM) networks have become a popular deep learning method for predicting the SOH of lithium-ion batteries, but an LSTM whose hyperparameters are not optimized can easily yield low prediction accuracy. A modified LSTM method based on the Sparrow Search Algorithm (SSA) is therefore proposed for SOH prediction. When the training set accounts for only 20 % of the total data, the root mean square error (RMSE) of the LFP battery predictions is within 0.85 % and the maximum absolute error (AE) is less than 2.5 %, while the RMSE of the NCA battery predictions is within 0.7 % and the maximum AE is less than 2.0 %. SSA-LSTM can accurately predict battery SOH with limited training data and has good robustness.

1. Introduction

Environmental pollution and the energy crisis have become major issues of common concern for countries around the world.¹ In recent years, many countries have therefore begun to vigorously develop the electric vehicle industry, and lithium-ion batteries are widely used in electric vehicles because of their long lifespan, fast charging, high energy density, and high voltage.²,³ However, after prolonged use, lithium-ion batteries lose capacity through aging, which may cause equipment malfunctions and even safety accidents.⁴ It is therefore necessary to carry out appropriate safety management and testing of lithium-ion batteries to prevent battery abuse and avoid safety accidents. The battery management system (BMS) ensures that the battery operates in a safe environment.⁵ The core purpose of the BMS is to evaluate the current working condition of the battery by monitoring its state of health (SOH), state of charge (SOC), and other indicators. Accurate SOH estimation is an important BMS function and is crucial for ensuring the driving comfort and safety of electric vehicles.⁶ SOH can be defined as the ratio of the present discharge capacity of a battery to the rated capacity of a new battery under certain conditions,⁷ as shown in Eq. 1, where Q_now is the present discharge capacity and Q_new is the rated capacity of the new battery.

\begin{equation} \textit{SOH} = \frac{Q_{\text{now}}}{Q_{\text{new}}} \times 100\,\% \end{equation}

(1)
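As a small illustration of Eq. 1, the following Python sketch computes SOH from two capacities. The function name and the example values (a 200 Ah module that has faded to 130 Ah) are chosen for illustration only.

```python
def state_of_health(q_now_ah: float, q_new_ah: float) -> float:
    """Return SOH in percent (Eq. 1): present capacity over rated capacity of a new battery."""
    if q_new_ah <= 0:
        raise ValueError("rated capacity must be positive")
    return q_now_ah / q_new_ah * 100.0


# Example: a module rated at 200 Ah that now delivers 130 Ah
soh_example = state_of_health(130.0, 200.0)
print(f"SOH = {soh_example:.1f} %")
```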

There are currently three main methods for evaluating the SOH of lithium-ion batteries: direct measurement methods, model-based methods, and data-driven methods.⁸ Direct measurement methods measure data related to battery degradation through electrochemical impedance spectroscopy (EIS), hybrid pulse power characterization (HPPC), Coulomb counting, and other experiments, and then predict the SOH of the battery.⁹ EIS obtains the internal impedance of the battery through a lengthy frequency response measurement, and the changes in impedance reflect the changes in the battery’s SOH.¹⁰ HPPC testing is based on DC pulses, where the test profile typically consists of a discharge pulse, a rest, a charging pulse, and another rest, with each pulse having a defined current amplitude and duration. Its purpose is to determine the dynamic power capability of the battery.¹¹ The Coulomb counting method calculates the present capacity of the battery through complete charging and discharging and obtains the SOH by dividing the present capacity by the rated capacity.¹² These methods adapt well to different batteries and have low computational complexity, but because of their high requirements for testing equipment and long testing times, they are more suitable for laboratory research than for practical applications.

The purpose of model-based methods is to describe the physical and chemical properties of the battery using mathematical equations. The models can mainly be divided into equivalent circuit models (EECM) and electrochemical models (ECM).¹³ A model-based algorithm is an indirect algorithm that does not rely on real-time monitoring of battery data but predicts the health status through filtering or intelligent algorithms or through identification of battery characteristic parameters. The equivalent circuit model is composed of electrical components and approximates the dynamic characteristics of a battery system. However, it is difficult to establish an accurate battery model because of the complex internal mechanisms and uncertain working conditions of the battery.¹⁴ The electrochemical model contains more than thirty parameters, and different model parameters have different sensitivities to terminal voltage. Identifying all model parameters at once is time-consuming and imprecise.¹⁵

In recent years, data-driven methods have attracted increasing attention owing to the vigorous development of big data and artificial intelligence. Data-driven methods are flexible and general, and they rely solely on the operational data of the battery.¹⁶ They abandon traditional physical models in favor of purely mathematical methods. Compared with model-based methods, they can adapt to different systems by recalibrating parameters. Typical data-driven models include Gaussian process regression,¹⁷ support vector machines,¹⁸ grey correlation analysis,¹⁹ and neural networks.²⁰ For data-driven models,²¹ selecting appropriate health indicators as model inputs is crucial, because the quality of the feature indicators determines the accuracy of the prediction results. Yang et al.²² extracted four health indicators from the charging curve and verified their strong correlation with SOH through the Pearson correlation coefficient. They also proposed a Gaussian process regression (GPR) model for predicting battery SOH. Klass et al.¹⁸ applied the features extracted from two battery cells, a resistor and a capacitor, to support vector machine (SVM) prediction of SOH. This method performed well under the power and memory limitations of vehicle applications. In summary, the extraction of health indicators underpins the accuracy of the model. Therefore, this article extracts a health indicator based on curve kurtosis from the probability density function (PDF)²³ of the battery charging voltage. Because the loss of lithium inventory (LLI), the increase in ohmic internal resistance (ORI), and the loss of active material (LAM) affect the shape and size of each peak on the PDF curve of the charging voltage, the kurtosis of the curve reflects the internal changes of the battery well and therefore qualifies as a health indicator. It has been verified to correlate strongly with battery SOH, which provides a sound basis for the model's predictions.

Battery aging is a process that unfolds over time, and the historical data of batteries contain significant information. Therefore, recurrent neural network (RNN) algorithms,²⁴,²⁵ which can handle time series data, are widely used for battery SOH prediction. However, RNNs suffer from exploding and vanishing gradients, which makes long-term dependencies difficult to learn. The emergence of long short-term memory (LSTM) has largely solved this problem. Li et al.²⁶ proposed a capacity prediction model based on LSTM, which can predict the battery capacity under actual operating conditions and remains robust even when dealing with input noise. Later, Gao et al.²⁷ constructed a new hybrid LSTM framework that extracts feature information from the original samples through a Hierarchical Feature Coupled Module (HFCM), addressing the poor prediction accuracy caused by insufficient feature extraction in traditional models. Kim et al.²⁸ proposed a variational LSTM with transfer learning to reduce the effort required to collect cycling data for new batteries. Although the effectiveness of LSTM in predicting SOH has been validated, some issues remain. Hyperparameters are difficult to set and are usually chosen from experience, and such manual tuning often degrades the prediction accuracy of the model. The purpose of this article is to improve the accuracy of neural network predictions and to minimize the amount of input data while maintaining prediction accuracy, so that the model performs better in practical applications. The main contributions are summarized as follows:

(1) Extracting kurtosis based on probability density function as a new battery health indicator. By using the ksdensity function in Matlab software, the battery charging voltage data is transformed into probability density data. The kurtosis of the three peaks on the probability density function curve has a good linear correlation with the battery SOH. These HIs are used as inputs for the battery SOH prediction model.
(2) An improved LSTM network model optimized using the Sparrow Search Algorithm (SSA) is proposed. SSA is used to optimize the hyperparameters of the LSTM structure; the optimal hyperparameters are considered found when the SSA fitness value no longer changes, and they then replace the original LSTM hyperparameters. This algorithm improves the prediction accuracy of LSTM for battery SOH.
(3) Through experiments, it was found that when the training set only accounts for 20 % of the total data, the prediction model still maintains high accuracy, with an absolute error (AE) controlled within 2.5 %, and RMSE, MAE, and MAPE values all less than 1.0 %.

The rest of this article is organized as follows. The second section presents the battery data used in the experiment and describes the data collection process. The third section introduces the process of extracting health indicators. The fourth section describes the improvement of the LSTM model by SSA algorithm. The results and discussion of SOH prediction are presented in Section 5. Finally, the paper is summarized in Section 6.

2. Experiments

2.1 Battery specifications and testing system

This article selects two different types of batteries for predictive analysis. The first is a 1P8S (8 identical single cells connected in series) lithium iron phosphate (LFP) battery module manufactured by BYD. Its positive and negative electrode materials are LFP and graphite, and its rated capacity and rated voltage are 200 Ah and 25.6 V, respectively. The other is a cylindrical lithium-ion battery (NCR21700A, Panasonic). Its negative electrode is graphite doped with 3.5 % crystalline silicon particles, and its positive electrode is NCA (Li_xNi_0.9Co_0.05Al_0.05O_1.57). Its rated capacity and rated voltage are 4.8 Ah and 3.6 V, respectively.

The LFP battery module is charged at a constant temperature of 25 °C in constant current (CC) mode at 66.6 A (1/3 C-rate) until the voltage reaches 29.2 V. After one hour of rest, it is discharged at 66.6 A (1/3 C-rate) until the voltage reaches 21.6 V. The experiment ends when the battery capacity has decreased by 35 % (from 200 Ah to 130 Ah), which occurs after 1000 cycles.

The NCR21700A battery undergoes charging and discharging cycles at room temperature (25 °C). The battery is charged at a current of 2.4 A (1/2 C-rate) until its voltage reaches 4.2 V and is then held in constant voltage (CV) mode until the current falls below 0.3 A. After one hour of rest, it is discharged at a constant 2.4 A (1/2 C-rate) until the cut-off voltage of 2.5 V is reached. The battery reaches the end of its service life when its capacity has decreased by 40 % (from 4.8 Ah to 2.9 Ah), which occurs after 300 cycles.

2.2 Data collection

The battery module tester (FTV 1-300-100, USA) is used for the charging and discharging experiments on the LFP modules, and the single-cell battery tester (MCV 2-200-5, USA) is used for the NCA battery experiments. The battery management system (BMS) collects voltage, current, time, and other test data and transfers the collected data to the computer through the data converter.

2.3 Probability density function (PDF)

The probability density function f(x) describes the relative likelihood that the random variable x takes a value near a given point. The probability P of the random variable x falling within a region [a, b] is the integral of the probability density function over this region,²³ as shown in Eq. 2.

\begin{equation} P = \int_{a}^{b}f(x) dx \end{equation}

(2)
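To make Eq. 2 concrete, the sketch below approximates the probability that a random variable falls within [a, b] by trapezoidal integration of its density from a to b. The standard normal density is used purely as a stand-in for a voltage PDF, and the function names are illustrative.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution, used here only as an example f(x)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def probability_in_interval(pdf, a, b, steps=2000):
    """Trapezoidal approximation of P = integral of pdf(x) dx from a to b."""
    h = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for k in range(1, steps):
        total += pdf(a + k * h)
    return total * h


# Probability mass of a standard normal within one standard deviation of the mean
p_one_sigma = probability_in_interval(normal_pdf, -1.0, 1.0)
print(round(p_one_sigma, 4))
```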

The main reasons for battery capacity degradation are the loss of lithium inventory, an increase in ohmic internal resistance, and the loss of active material. As the battery ages, the voltage plateau of battery charging and discharging becomes shorter. Therefore, when the PDF method is used to calculate how frequently each voltage occurs during charging and discharging, the frequencies in some characteristic voltage ranges reflect the internal reactions and aging of the battery well.

3. Extraction of Health Indicators (HI)

3.1 Analysis of charge and discharge curve

Figure 1 shows the extraction process of the HIs, with NCA and LFP batteries selected as experimental subjects in this section. Figures 1a–1d show the charging and discharging curves of the batteries. As the number of cycles increases, the voltage plateau gradually rises with battery degradation, and the constant current (CC) charging time continuously decreases. The CC charging time directly reflects the capacity charged during the constant current stage and indicates the polarization inside the battery, which becomes increasingly apparent as the battery ages. In practical applications, batteries are rarely fully charged and discharged. To ensure that the charging time remains usable, the charging time (T₁) of NCA batteries over the range 3.6 V to 4.0 V and the charging time (T₂) of LFP batteries over the range 25 V to 25.8 V are selected as health indicators.²⁹

Figure 1.

Schematic diagram for extracting health indicators. (a), (b), (c), and (d) are the selection ranges for NCA batteries and LFP batteries T₁ and T₂. (e), (h) are the charging voltage PDF curve of NCA battery and LFP battery. (f), (i) are the curvature of each point on the PDF curve of the charging voltage for NCA and LFP batteries. (g), (j) are the position corresponding to the maximum curvature of NCA and LFP batteries on the PDF curve.

3.2 Health indicator extraction based on charging PDF curves

The PDF curve of the battery charging voltage is used as experimental data in this article mainly because lithium plating can only occur during the charging stage. Lithium plating degrades battery performance, for example by increasing internal resistance, decreasing energy density, and decreasing charging and discharging efficiency. Therefore, analyzing the charging stage voltage reflects the process of battery aging more intuitively. The ksdensity function in MATLAB is used to calculate the probability density distribution of the charging voltage. This function uses kernel density estimation: a Gaussian kernel is centered on each data point in the dataset, and these kernels are weighted and summed to obtain the probability density estimate, as shown in Eq. 3, where V₁, V₂, …, V_n are the univariate charging voltage data, n is the sample size, φ is the Gaussian kernel function, and h is the smoothing parameter (bandwidth).

\begin{equation} f(x) = \frac{1}{nh}\sum\nolimits_{i = 1}^{n}\varphi \left(\frac{x - V_{i}}{h}\right) \end{equation}

(3)
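The estimator in Eq. 3 can be sketched in plain Python as follows. This is not the MATLAB ksdensity implementation: the bandwidth h is passed in by hand here (ksdensity selects one automatically by default), and the sample voltages, grid size, and padding are hypothetical values chosen for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, playing the role of phi in Eq. 3."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Kernel density estimate f(x) = 1/(n h) * sum_i phi((x - V_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - v) / h) for v in samples) / (n * h)


def kde_curve(samples, h, grid_points=200, pad=4.0):
    """Evaluate the estimate on a uniform grid that extends pad*h past the data."""
    lo = min(samples) - pad * h
    hi = max(samples) + pad * h
    step = (hi - lo) / (grid_points - 1)
    xs = [lo + k * step for k in range(grid_points)]
    return xs, [kde(x, samples, h) for x in xs]


# Hypothetical charging-voltage samples (V) and a hand-picked bandwidth
voltages = [3.60, 3.62, 3.65, 3.80, 3.82, 4.00, 4.05, 4.10]
xs, density = kde_curve(voltages, h=0.02)
```

Because each kernel integrates to one and the sum is divided by n, the resulting curve integrates to approximately one over the grid, which is a quick sanity check on any implementation.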

Figures 1e and 1h show that, as the battery ages, the A, B, and C peaks on its PDF curve change to varying degrees. This phenomenon is caused by the loss of lithium inventory (LLI) inside the battery, the increase in ohmic internal resistance (ORI), and the loss of active material (LAM). The changes in the A and B peaks of NCA and LFP batteries are mainly caused by LLI and by LAM on the positive electrode, and as the number of cycles increases, the B and C peaks of both batteries also change. This means that battery aging is caused not only by LLI and LAM on the positive electrode but also by LAM on the negative electrode. The peaks of the charging voltage PDF curve therefore contain characteristics that reflect battery aging.

Kurtosis, also known as the kurtosis coefficient, is a characteristic number that describes how high and sharp or how low and flat a distribution is around its mean. Intuitively, it reflects the sharpness of the curve peaks. Kurtosis is calculated using Eq. 4.

\begin{equation} K = \frac{\dfrac{1}{n}\displaystyle\sum\nolimits_{i = 1}^{n}(x_{i} - \bar{x})^{4} }{\left[\dfrac{1}{n}\displaystyle\sum\nolimits_{i = 1}^{n}(x_{i} - \bar{x})^{2} \right]^{2}} \end{equation}

(4)

where K represents kurtosis, x_i is the i-th value of the probability density dataset, $\bar{x}$ is the mean of the dataset, and n is the length of the dataset.
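Eq. 4 is the ratio of the fourth central moment to the squared second central moment. A minimal Python sketch follows; the function name and the small example list are illustrative only.

```python
def kurtosis(values):
    """Kurtosis per Eq. 4: fourth central moment divided by the squared variance."""
    n = len(values)
    if n == 0:
        raise ValueError("dataset must not be empty")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 * m2)


# Example: deviations -2, -1, 0, 1, 2 give m2 = 2 and m4 = 6.8, so K = 1.7
print(kurtosis([1, 2, 3, 4, 5]))
```

Note that this is the non-excess form, for which a normal distribution has a kurtosis of 3.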

In statistics, kurtosis is a measure of the peakedness and tail weight of the probability distribution of a real-valued random variable. A high kurtosis indicates that more of the variance is caused by infrequent extreme deviations from the mean. The kurtosis values of peaks A, B, and C in the PDF curve are defined as K₁, K₂, and K₃, respectively.

This article defines the range of a peak as the interval between the two valleys on either side of it. Peaks and valleys are separated by calculating the curvature of the PDF curve with Eq. 5. Figure 1f shows the curvature at each point on the PDF curve, with points A, B, and C representing the minimum values corresponding to the peaks in Fig. 1e, and points D, E, and F representing the maximum values corresponding to the valleys in Fig. 1e. Using point O at 3.5 V as the initial point, the voltage probability density values within the OD, DE, and EF ranges are substituted into Eq. 4 to calculate the kurtosis of peaks A, B, and C.

\begin{equation} C = \frac{|y''|}{(1 + y'^{2})^{\frac{3}{2}}} \end{equation}

(5)

where C means curvature, y′ and y′′ represent the first and second derivatives of y with respect to x, respectively.
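One way to evaluate Eq. 5 on a sampled curve is with central finite differences. The sketch below assumes a uniformly spaced grid and returns the curvature at interior points only; both the discretization and the function name are assumptions, since the text does not say how the derivatives were computed.

```python
def curvature(xs, ys):
    """Curvature C = |y''| / (1 + y'^2)^(3/2) at interior points of a uniform grid."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three matching samples")
    h = xs[1] - xs[0]  # uniform spacing is assumed
    result = []
    for k in range(1, len(xs) - 1):
        d1 = (ys[k + 1] - ys[k - 1]) / (2.0 * h)            # first derivative
        d2 = (ys[k + 1] - 2.0 * ys[k] + ys[k - 1]) / (h * h)  # second derivative
        result.append(abs(d2) / (1.0 + d1 * d1) ** 1.5)
    return result


# Example: y = x^2 has curvature 2 at its vertex and less elsewhere
grid = [k * 0.1 for k in range(-2, 3)]
curve = [x * x for x in grid]
print(curvature(grid, curve))
```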

4. Modeling Algorithms

4.1 Recurrent Neural Network (RNN)

A recurrent neural network (RNN) is a neural network that takes sequence data as input and performs recursion along the direction in which the sequence evolves. An RNN can use its internal memory to process input sequences of various lengths. Elman and Jordan networks, fully recurrent RNNs, independent RNNs, and LSTM are all variants of the RNN.

4.2 Long Short-Term Memory (LSTM)

LSTM was proposed by Hochreiter and Schmidhuber³⁰ in 1997 and is a special type of recurrent neural network designed to address the long-term dependency problem of general RNNs. It effectively mitigates the vanishing gradient problem that arises during backpropagation in RNNs. The LSTM network structure is shown in Fig. 2, and its memory cell is controlled by the forget gate (f_t), the input gate (i_t), and the output gate (O_t). The forget gate receives the long-term memory $C_{t - 1}$ output by the previous unit module and determines which parts of $C_{t - 1}$ should be retained or forgotten. The input gate determines which new information is stored in the unit module to update the current state. The output gate uses a sigmoid function to determine which parts of the cell state to output; the cell state is then passed through a tanh layer, and the two are multiplied to obtain the final output. The update formulas for the three gates are shown in Eqs. 6–11:

\begin{equation} f_{\text{t}} = \sigma (W_{\text{f}}[h_{\text{t}-1}, x_{\text{t}}] + b_{\text{f}}) \end{equation}

(6)

\begin{equation} i_{\text{t}} = \sigma (W_{\text{i}}[h_{\text{t}-1},x_{\text{t}}] + b_{\text{i}}) \end{equation}

(7)

\begin{equation} C_{\text{t}}' = \tanh (W_{\text{c}}[h_{\text{t}-1},x_{\text{t}}] + b_{\text{c}}) \end{equation}

(8)

\begin{equation} C_{\text{t}} = f_{\text{t}}C_{\text{t}-1} + i_{\text{t}}C_{\text{t}}' \end{equation}

(9)

\begin{equation} O_{\text{t}} = \sigma (W_{\text{o}}[h_{\text{t}-1},x_{\text{t}}] + b_{\text{o}}) \end{equation}

(10)

\begin{equation} h_{\text{t}} = O_{\text{t}}\tanh (C_{\text{t}}) \end{equation}

(11)

where W and b represent the corresponding weights and biases; x_t represents the input value; t represents the current time; t − 1 represents the previous moment; h represents the hidden state; i_t represents the output of the input gate at time t; C_t′ is the candidate cell state; C_t is the current cell state; f_t and O_t are the outputs of the forget gate and the output gate at the current time; σ is the sigmoid function, and tanh is the hyperbolic tangent activation function.
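Eqs. 6–11 describe a single time step of the cell. The following plain-Python sketch implements that step with lists in place of tensors. The parameter dictionary layout (keys such as "Wf" and "bf") is an assumption made for illustration and does not correspond to any particular deep learning library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def affine(W, b, v):
    """Compute W v + b for a weight matrix W (list of rows) and bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. 6-11; returns (h_t, C_t)."""
    v = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(params["Wf"], params["bf"], v)]         # Eq. 6
    i = [sigmoid(z) for z in affine(params["Wi"], params["bi"], v)]         # Eq. 7
    c_cand = [math.tanh(z) for z in affine(params["Wc"], params["bc"], v)]  # Eq. 8
    c = [ft * cp + it * cc for ft, cp, it, cc in zip(f, c_prev, i, c_cand)]  # Eq. 9
    o = [sigmoid(z) for z in affine(params["Wo"], params["bo"], v)]         # Eq. 10
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                        # Eq. 11
    return h, c


# Example with hidden size 2 and input size 1; all-zero weights make every gate 0.5
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
params = {name: zeros for name in ("Wf", "Wi", "Wc", "Wo")}
params.update({name: [0.0, 0.0] for name in ("bf", "bi", "bc", "bo")})
h_t, c_t = lstm_step([0.5], [0.0, 0.0], [1.0, -2.0], params)
```

With zero weights the forget gate halves the previous cell state and the candidate contributes nothing, which makes the data flow of Eq. 9 easy to check by hand.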

Figure 2.

LSTM network structure.

4.3 Sparrow Search Algorithm (SSA)

The Sparrow Search Algorithm, a new type of metaheuristic algorithm, was proposed by Xue and Shen in 2020.³¹ It is a swarm intelligence algorithm with strong global optimization ability and few adjustable parameters, inspired by the foraging and anti-predatory behavior of sparrow populations. The model consists of three roles, “Explorer”, “Follower”, and “Warning”, which together seek the overall optimal solution. The explorer has a strong search ability and high adaptability. It is responsible for finding food and providing foraging areas and directions for the entire sparrow population. Followers have high adaptability and compete with explorers for food to increase their predation rate. Followers and explorers can switch roles under certain conditions. Warning sparrows detect danger and issue warning signals in a timely manner. When the alarm value exceeds the preset safety threshold, the population is aware of the threat from predators, engages in anti-predatory behavior, and changes its position. The sparrows at the edge of the population continuously adjust their positions, and the sparrows inside the population move toward neighboring sparrows.

Assuming the total number of sparrows is n, the position of each sparrow in the D-dimensional space can be described as Eq. 12:

\begin{equation} V = \begin{bmatrix} V_{1,1} & V_{1,2} & \cdots & \cdots & V_{1,D} \\ V_{2,1} & V_{2,2} & \cdots & \cdots & V_{2,D} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ V_{n,1} & V_{n,2} & \cdots & \cdots & V_{n,D} \end{bmatrix} \end{equation}

(12)

Therefore, the fitness of all sparrows can be expressed as Eq. 13:

\begin{equation} f = \begin{bmatrix} f([V_{1,1} & V_{1,2} & \cdots & \cdots & V_{1,D}]) \\ f([V_{2,1} & V_{2,2} & \cdots & \cdots & V_{2,D}]) \\ & & \vdots & & \\ f([V_{n,1} & V_{n,2} & \cdots & \cdots & V_{n,D}]) \\ \end{bmatrix} \end{equation}

(13)

The formula for updating a follower’s position, relative to the best position found by the explorers, after each iteration is shown in Eq. 14:

\begin{equation} V_{i,C}^{t + 1} = \begin{cases} Q \cdot \exp \left(\dfrac{V_{\textit{worst}}^{t} - V_{i,C}^{t}}{i{}^{2}}\right) & \textit{if $i > \dfrac{n}{2}$} \\ V_{B}^{t + 1} + | V_{i,C}^{t} - V_{B}^{t + 1} | \cdot A^{ + } \cdot L & \textit{if $i \leq \dfrac{n}{2}$} \end{cases} \end{equation}

(14)

where t indicates the current iteration, $V_{i,C}^{t}$ represents the position of the i-th sparrow in the C-th dimension at iteration t, and Q is a random number obeying the normal distribution. V_B represents the optimal position occupied by the explorers, V^t_worst represents the current worst position, A is a 1 × D matrix whose elements are randomly assigned 1 or −1, $A^{ + } = A^{T}( AA^{T} )^{ - 1}$, and L is a 1 × D matrix whose elements are all 1.
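The position update of Eq. 14 can be sketched for a single sparrow as follows. The sketch assumes that the branch threshold is n/2, as in the original SSA formulation, and that i is the 1-based rank of the sparrow by fitness. The random quantities Q and A are passed in explicitly so that the function is deterministic; the function names are illustrative.

```python
import math


def pseudo_inverse_row(a):
    """A+ = A^T (A A^T)^-1 for a 1 x D row vector A, returned as a flat list (D x 1)."""
    norm_sq = sum(x * x for x in a)  # A A^T is a scalar; it equals D for +/-1 entries
    return [x / norm_sq for x in a]


def update_position(v_i, v_best, v_worst, i, n, q, a):
    """Apply Eq. 14 to sparrow i (1-based rank) in a population of n sparrows."""
    d = len(v_i)
    if i > n / 2:
        # Low-ranked sparrow: jump according to its distance from the worst position
        return [q * math.exp((v_worst[c] - v_i[c]) / (i * i)) for c in range(d)]
    # |V_i - V_B| (1 x D) times A+ (D x 1) is a scalar; multiplying by L
    # (1 x D of ones) applies the same step in every dimension.
    a_plus = pseudo_inverse_row(a)
    step = sum(abs(v_i[c] - v_best[c]) * a_plus[c] for c in range(d))
    return [v_best[c] + step for c in range(d)]


# Example in two dimensions with hand-picked Q and A
near_best = update_position([1.0, 3.0], [0.0, 0.0], [2.0, 2.0], i=2, n=10, q=1.0, a=[1, -1])
far_away = update_position([1.0, 3.0], [0.0, 0.0], [2.0, 2.0], i=8, n=10, q=1.0, a=[1, -1])
```

In a full optimizer, Q would be drawn from a normal distribution and A resampled for each sparrow at every iteration.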

4.4 SSA-optimized LSTM

The SSA algorithm converges quickly and has strong optimization ability when solving optimization problems. We therefore apply it to the hyperparameter optimization problem in network models, to minimize the impact of manual tuning on the network model and enhance its predictive ability. This article takes the learning rate, time step, number of neurons in the LSTM layer, and number of epochs as the parameters to be optimized. SSA is used to optimize these LSTM hyperparameters over multiple iterations, and the optimal parameters are considered to have been found when the fitness value of SSA no longer changes. The overall framework of the model is shown in Fig. 3.
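Two pieces of glue are needed to connect a population-based optimizer to the LSTM: a mapping from a sparrow position to the four hyperparameters, and a test for when the fitness has stopped changing. The sketch below shows one possible form of each. The search bounds, the patience of 10 iterations, and the tolerance are hypothetical values, since the text does not state them.

```python
# Hypothetical search bounds for the four optimized hyperparameters
BOUNDS = {
    "learning_rate": (1e-4, 1e-1),
    "time_step": (1, 20),
    "hidden_units": (8, 128),
    "epochs": (50, 500),
}
INTEGER_PARAMS = {"time_step", "hidden_units", "epochs"}


def decode_position(position):
    """Map a sparrow position in [0, 1]^4 to a dictionary of LSTM hyperparameters."""
    params = {}
    for u, (name, (lo, hi)) in zip(position, BOUNDS.items()):
        u = min(max(u, 0.0), 1.0)  # clip to the unit interval
        value = lo + u * (hi - lo)
        params[name] = int(round(value)) if name in INTEGER_PARAMS else value
    return params


def fitness_has_stalled(best_history, patience=10, tol=1e-6):
    """Return True when the best fitness has not changed over the last `patience` iterations."""
    if len(best_history) <= patience:
        return False
    return abs(best_history[-1] - best_history[-1 - patience]) <= tol


print(decode_position([0.5, 0.5, 0.5, 0.5]))
```

In this arrangement, the fitness of a position would be the validation error of an LSTM trained with the decoded hyperparameters, and the search would stop once `fitness_has_stalled` returns True.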

Figure 3.

The SOH prediction process of lithium-ion battery based on SSA-LSTM.

4.5 The prediction criteria of SOH prediction

To verify the accuracy of the SSA-LSTM method, the mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and absolute error (AE) are selected as evaluation criteria in this paper. MAE is the average of the absolute differences between the predicted and actual SOH values and describes the general performance of the method. MAPE measures the relative percentage error between the predicted and actual SOH values. RMSE and AE represent the deviation between the predicted value and the actual value. The closer the four indicators are to zero, the higher the accuracy of the proposed model. MAE, RMSE, MAPE, and AE are calculated using Eqs. 15–18.

\begin{equation} \textit{MAE} = \frac{1}{n} \times \sum\nolimits_{i = 1}^{n}| Y_{i}-Y_{i}'| \end{equation}

(15)

\begin{equation} \textit{RMSE} = \sqrt{\frac{1}{n} \times \sum\nolimits_{i = 1}^{n}(|Y_{i}-Y_{i}'|)^{2}} \end{equation}

(16)

\begin{equation} \textit{MAPE} = \frac{1}{n} \times \sum\nolimits_{i = 1}^{n}\left|\frac{Y_{i}-Y_{i}'}{Y_{i}}\right| \times 100\,\% \end{equation}

(17)

\begin{equation} \textit{AE} = Y_{i}' - Y_{i} \end{equation}

(18)

where n is the number of samples, Y_i is the real value, Y_i′ is the predicted value.
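The four criteria of Eqs. 15–18 can be computed as in the sketch below. MAPE is returned as a percentage, which is an assumption made to match how the results are reported, and AE is kept as a signed per-sample difference as written in Eq. 18. The sample values are invented for illustration.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error (Eq. 15)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error (Eq. 16)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error (Eq. 17), expressed in percent."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true) * 100.0


def absolute_errors(y_true, y_pred):
    """Per-sample error Y' - Y (Eq. 18)."""
    return [p - t for t, p in zip(y_true, y_pred)]


# Invented SOH values in percent
actual = [100.0, 90.0, 80.0]
predicted = [99.0, 91.0, 82.0]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```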

5. Results and Discussion

5.1 Cycling degradation of batteries

Figures 4a–4d show the V-Q curves of the charging and discharging of the two types of batteries. After 300 cycles, the SOH of the NCA battery decreased from 100 % to 60 %. After 1000 cycles, the SOH of the LFP battery decreased from 100 % to 65 %. As the battery ages and degrades, the V-Q curves of charge and discharge gradually decrease, which means that the internal resistance of the battery increases as its SOH decreases. The charging V-Q curve of the NCA battery has two stages, constant current and constant voltage, while the charging V-Q curve of the LFP battery module has only a constant current stage. The reason is that the module is formed by connecting 8 individual batteries in series, and there is a resistance at the series connection between each pair of adjacent batteries. When current passes, this resistance takes up part of the voltage, which prevents the battery voltage from reaching the theoretical value, that is, the preset voltage in the testing program. Therefore, the charging V-Q curve of the LFP battery module has only a constant current stage.

Figure 4.

V-Q curves of charge and discharge for NCA and LFP batteries.

5.2 The correlation between HI and SOH

The Pearson correlation coefficient is used to calculate the correlation between each HI and SOH, and its results can determine whether the HI can be used as input to the model for predicting SOH. The Pearson correlation coefficient can be calculated using Eq. 19.

\begin{equation} \rho = \frac{\displaystyle\sum\nolimits_{i = 1}^{n}(D_{i}-\bar{D})(E_{i}-\bar{E}) }{\sqrt{\displaystyle\sum\nolimits_{i = 1}^{n}(D_{i}-\bar{D})^{2}\displaystyle\sum\nolimits_{i = 1}^{n}(E_{i}-\bar{E})^{2}}} \end{equation}

(19)

where D_i is the HI sequence, E_i is the sequence of battery SOH, and $\bar{D}$ and $\bar{E}$ are their average values. The range of ρ is [−1, 1]. The stronger the linear correlation between an HI and battery SOH, the closer the absolute value of ρ is to 1. When ρ approaches zero, there is only a weak linear relationship between the HI and battery SOH. The calculation results are shown in Table 1. The absolute values of the Pearson correlation coefficients between the five extracted HIs and SOH are all greater than 0.96, indicating a good correlation between the actual SOH of the battery and the HIs.
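Eq. 19 translates directly into a few lines of Python. The function name and the toy sequences below are illustrative only and are not the measured HI or SOH data.

```python
import math


def pearson(d, e):
    """Pearson correlation coefficient between sequences d and e (Eq. 19)."""
    if len(d) != len(e) or len(d) < 2:
        raise ValueError("sequences must have equal length of at least two")
    mean_d = sum(d) / len(d)
    mean_e = sum(e) / len(e)
    cov = sum((x - mean_d) * (y - mean_e) for x, y in zip(d, e))
    var_d = sum((x - mean_d) ** 2 for x in d)
    var_e = sum((y - mean_e) ** 2 for y in e)
    return cov / math.sqrt(var_d * var_e)


# A health indicator that falls as SOH falls gives rho close to +1;
# one that rises as SOH falls gives rho close to -1.
soh = [100.0, 95.0, 90.0, 85.0]
print(pearson([4.0, 3.8, 3.6, 3.4], soh), pearson([1.0, 1.2, 1.4, 1.6], soh))
```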

Table 1. Correlation between various HI and SOH.

Battery type	HI	Pearson correlation coefficient
LFP	K₁	0.9760
	K₂	0.9750
	K₃	−0.9858
	T₁	0.9947
	T₂	0.9909
NCA	K₁	0.9643
	K₂	−0.9817
	K₃	−0.9795
	T₁	0.9940
	T₂	0.9816

Figure 5 shows the linear fits between the different HIs and SOH for the two types of batteries. Figures 5a–5c and 5f–5h show the fits of the kurtosis of peaks A, B, and C of the charging voltage PDF curves against SOH for the NCA and LFP batteries, respectively. Figures 5d, 5e and 5i, 5j show the fits of the two time-based HIs against SOH: the constant current charging time and the constant current charging time within the constant discharge voltage range. Because the positive electrode materials differ, the PDF curves of the two batteries also evolve differently with aging. Peak B of the NCA battery gradually increases as the battery ages, whereas peak B of the LFP battery gradually decreases. The main reason is that the change in peak B of the NCA battery is caused by LAM on the negative electrode, while the change in peak B of the LFP battery is caused by LLI on the positive electrode. The kurtosis fits of the two battery types therefore have linear relationships of different sign. For the NCA battery, the kurtosis of peak A is linearly positively correlated with SOH (Fig. 5a), while the kurtosis of peaks B and C is linearly negatively correlated with SOH (Figs. 5b and 5c). For the LFP battery, the kurtosis of peaks A and B is linearly positively correlated with SOH (Figs. 5f and 5g), while the kurtosis of peak C is linearly negatively correlated with SOH. Because battery capacity directly affects the charging and discharging time, both time-based HIs are linearly positively correlated with SOH, as Figs. 5d, 5e and 5i, 5j confirm. Together, the Pearson correlation coefficients and the fitting results verify the strong correlation between the five HIs and battery SOH. Therefore, these five HIs can be used as model inputs to further predict the future trend of battery SOH.
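The correlation check described above can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names `pearson_r` and `linear_fit` and the HI/SOH values are hypothetical, chosen only to show how a Pearson correlation coefficient and a least-squares line are obtained for one HI.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def linear_fit(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

# Hypothetical HI (e.g. a peak kurtosis) and SOH (%) pairs, for illustration only.
hi_values = [3.10, 3.05, 2.98, 2.91, 2.86]
soh_values = [100.0, 97.5, 95.2, 92.4, 90.1]

r = pearson_r(hi_values, soh_values)
slope, intercept = linear_fit(hi_values, soh_values)
print(f"r = {r:.4f}, SOH = {slope:.2f} * HI + {intercept:.2f}")
```

A positive slope corresponds to the positively correlated HIs in Fig. 5 and a negative slope to the negatively correlated ones; the magnitude of r indicates how well a straight line describes the relationship.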

Figure 5.

Linear fitting results between different HIs and SOH. (a–e): K₁ (K₂, K₃, T₁, T₂)-SOH models for NCA battery; (f–j): K₁ (K₂, K₃, T₁, T₂)-SOH models for LFP battery.

5.3 SOH prediction

To demonstrate the effect of the optimization, this section compares the LSTM model with the SSA-LSTM model under controlled conditions. Each model is trained on 60 %, 40 % and 20 % of the overall data set, and its predictions are compared with the remaining data to verify the accuracy of the SOH model.
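The data partition can be illustrated with a short sketch. It assumes that the split is chronological, with the earliest cycles used for training and the later cycles held out for testing; the text does not state the ordering explicitly, so this is an assumption. The helper name `chronological_split` and the cycle list are hypothetical.

```python
def chronological_split(samples, train_percent):
    """Split a cycle-ordered sequence into (train, test).

    The first `train_percent` percent of the samples form the training
    set and the remainder forms the test set (assumed ordering).
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")
    n_train = len(samples) * train_percent // 100
    if n_train == 0:
        raise ValueError("training set would be empty")
    return samples[:n_train], samples[n_train:]

# Hypothetical aging record of 10 cycles, identified by cycle number.
cycles = list(range(1, 11))

for percent in (60, 40, 20):
    train, test = chronological_split(cycles, percent)
    print(f"{percent} % training: {len(train)} train cycles, {len(test)} test cycles")
```

Keeping the order of the cycles intact prevents later aging data from leaking into the training set, which matters for time series models such as LSTM.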

The prediction results for the two types of batteries are shown in Fig. 6. Figures 6a–6c and 6g–6i show the predictions of the LSTM network model, and Figs. 6d–6f and 6j–6l show the predictions of the LSTM network model optimized by SSA. The SOH predicted by SSA-LSTM tracks the actual SOH of the battery more closely, and it also follows the trend when the SOH changes suddenly. These sudden changes are usually due to the capacity rebound of the battery after a period of rest during the aging test.³² This shows that LSTM neural networks have strong tracking ability and are well suited to processing time series data. When the training set is relatively small, as in Figs. 6c and 6i, the LSTM predictions deviate significantly from the actual values. The reason is that, during training, the hyperparameters of the LSTM model only fit the data in the training set and generalize poorly to the test data. After optimization by the SSA algorithm, each hyperparameter is driven toward its optimal value, so the predicted results are closer to the real values.

Figure 6.

Prediction performance for the two types of batteries at training set ratios of 60 %, 40 % and 20 %. (a–c) are the prediction results of the LSTM model for NCA battery capacity, and (d–f) are the prediction results of the SSA-LSTM model for NCA battery capacity. (g–i) are the prediction results of the LSTM model for LFP battery capacity, and (j–l) are the prediction results of the SSA-LSTM model for LFP battery SOH.

Table 2 lists the RMSE, MAE, and MAPE of the SOH predicted by the LSTM and SSA-LSTM models under different proportions of training data for the two types of batteries. In every case, the RMSE, MAE, and MAPE of the SSA-LSTM model are smaller than those of the LSTM model, which shows that the effect of hyperparameter optimization is reflected in the error values and that the SSA-LSTM model has better prediction accuracy. The gap is largest when the training set proportion is 40 % or 20 %, where the errors of SSA-LSTM are significantly lower than those of LSTM. Therefore, we can conclude that the accuracy improvement is more significant when the proportion of the training set is smaller.

Table 2. RMSE, MAE and MAPE values of predicted SOH of NCA and LFP batteries at different training set proportions using different model algorithms.

Battery type	Model	Prediction criterion	Training set 60 %	Training set 40 %	Training set 20 %
NCA	LSTM	RMSE (%)	0.3237	0.7335	1.212
		MAE (%)	0.2639	0.6424	1.031
		MAPE (%)	0.3913	0.9208	1.4198
	SSA-LSTM	RMSE (%)	0.2916	0.3772	0.6965
		MAE (%)	0.2346	0.3032	0.5941
		MAPE (%)	0.3499	0.4368	0.8226
	BP	RMSE (%)	0.3452	1.2569	2.1572
		MAE (%)	0.3033	1.1670	2.0431
		MAPE (%)	0.4346	1.5869	2.6082
LFP	LSTM	RMSE (%)	0.4318	0.7654	1.5422
		MAE (%)	0.3651	0.6894	1.3891
		MAPE (%)	0.3589	0.9458	1.5495
	SSA-LSTM	RMSE (%)	0.3099	0.4830	0.8297
		MAE (%)	0.2375	0.3898	0.7180
		MAPE (%)	0.3536	0.4757	0.8489
	BP	RMSE (%)	0.4099	1.6263	2.0638
		MAE (%)	0.3375	1.2667	1.6981
		MAPE (%)	0.4030	1.4153	1.8234
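For reference, the three criteria in Table 2 can be computed as in the sketch below, which uses their standard definitions. It assumes that SOH is expressed in percent, so that RMSE and MAE come out in percentage points; the SOH values are hypothetical and are not taken from the experiments.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical actual and predicted SOH values in percent, for illustration only.
soh_actual = [95.0, 94.2, 93.5, 92.1, 91.0]
soh_predicted = [94.6, 94.9, 93.1, 92.0, 91.8]

print(f"RMSE = {rmse(soh_actual, soh_predicted):.4f} %")
print(f"MAE  = {mae(soh_actual, soh_predicted):.4f} %")
print(f"MAPE = {mape(soh_actual, soh_predicted):.4f} %")
```

Because RMSE squares the errors before averaging, it weights large deviations more heavily than MAE, and for any data set the MAE cannot exceed the RMSE.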

Figure 7 plots the absolute error (AE) to give a clearer and more intuitive view of the prediction error. For the NCA battery, when the training set is 60 %, the AE of both LSTM and SSA-LSTM is less than 1.0 %. When the training set is 40 %, the AE of SSA-LSTM remains within 1.0 %, while the AE of LSTM only stays within 2.0 %. When the training set is 20 %, the AE of LSTM exceeds 2.0 % at some points, with most predictions having errors between 1.0 % and 2.0 %, whereas the AE of SSA-LSTM is mostly within 1.0 %, with a few points between 1.0 % and 2.0 %. The AE comparison for the LFP battery is more pronounced. When the training set is 60 %, the AE of SSA-LSTM is within 1.0 % and the AE of LSTM is within 2.0 %. When the training set is 40 %, the AE of SSA-LSTM remains within 2.0 %, while the AE of LSTM exceeds 2.0 % but stays within 3.0 %. When the training set is 20 %, the AE of LSTM exceeds 3.0 %, while the AE of SSA-LSTM is mostly within 2.0 %; only a few points slightly exceed this range, and the overall AE is within 2.5 %. These results indicate that optimizing the LSTM hyperparameters with SSA significantly improves the prediction accuracy. In experiments on both the NCA and LFP battery data, the model achieved accurate SOH predictions, indicating that the accuracy, robustness, and generalization of the method can meet practical needs.
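The kind of AE band analysis reported above can be reproduced with a sketch such as the following. The helper names `absolute_errors` and `share_within` and the SOH values are hypothetical; the bounds 1.0 %, 2.0 % and 2.5 % are the ones discussed in the text.

```python
def absolute_errors(actual, predicted):
    """Point-wise absolute error |actual - predicted|."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return [abs(a - p) for a, p in zip(actual, predicted)]

def share_within(errors, bound):
    """Fraction of points whose absolute error does not exceed `bound`."""
    return sum(e <= bound for e in errors) / len(errors)

# Hypothetical actual and predicted SOH values in percent, for illustration only.
soh_actual = [95.0, 94.2, 93.5, 92.1, 91.0]
soh_predicted = [94.6, 94.9, 92.3, 92.0, 93.2]

errors = absolute_errors(soh_actual, soh_predicted)
for bound in (1.0, 2.0, 2.5):
    print(f"share of points with AE <= {bound} %: {share_within(errors, bound):.2f}")
print(f"maximum AE: {max(errors):.2f} %")
```

Reporting both the share of points inside each band and the maximum AE separates the typical error from the few outlying points mentioned for the 20 % training set.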

Figure 7.

Absolute error of the predictions at different training set proportions. (a–f) Absolute errors of the LSTM and SSA-LSTM networks for the NCA battery with 60 %, 40 %, and 20 % of the total data used as the training set. (g–l) Absolute errors of the LSTM and SSA-LSTM networks for the LFP battery with 60 %, 40 %, and 20 % of the total data used as the training set.

5.4 Comparison of different SOH prediction methods

To further evaluate the performance of the SSA-LSTM model proposed in this article, this section compares its prediction results with those of a BP neural network. The SSA-LSTM predictions are superior: for both battery data sets, the MAE, RMSE, and MAPE of SSA-LSTM are all smaller than those of BP. Figure 8 and Table 2 show that when the training set accounts for a large proportion of the data, the prediction error of SSA-LSTM is smaller than that of the BP neural network, but the advantage is relatively small, and the BP neural network can also accurately predict the actual SOH. However, when the training data are only 20 % of the total, the BP predictions show significant errors, while SSA-LSTM can still accurately predict the actual SOH of the battery. Therefore, it can be concluded that SSA optimization improves the accuracy of LSTM, and that its accuracy advantage over other network models is greater when the training data volume is small. Table 3 lists SOH prediction results obtained with different methods in the literature. Compared with these, the SSA-LSTM method proposed in this article has smaller errors and higher prediction accuracy, and it can predict more data from a smaller training set.
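The size of the advantage at a 20 % training set can be quantified directly from the RMSE values in Table 2. The sketch below is an illustrative calculation only; the helper `relative_reduction` and the dictionary layout are hypothetical, while the numbers are copied from the 20 % column of Table 2.

```python
# RMSE (%) at a 20 % training set, copied from Table 2.
RMSE_20 = {
    "NCA": {"LSTM": 1.212, "SSA-LSTM": 0.6965, "BP": 2.1572},
    "LFP": {"LSTM": 1.5422, "SSA-LSTM": 0.8297, "BP": 2.0638},
}

def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

for battery, values in RMSE_20.items():
    for baseline in ("LSTM", "BP"):
        gain = relative_reduction(values[baseline], values["SSA-LSTM"])
        print(f"{battery}: RMSE of SSA-LSTM is {gain:.1f} % lower than that of {baseline}")
```

The same calculation applied to the 60 % column gives much smaller reductions, which is consistent with the observation that the benefit of SSA optimization grows as the training set shrinks.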

Figure 8.

Comparison of SSA-LSTM model and BP neural network prediction results at training set ratios of 60 %, 40 % and 20 %.

Table 3. Prediction accuracy of models in different literature at 60 % training set.

Reference	Battery type	Model	MAE (%)	RMSE (%)	MAPE (%)	AE (%)
Gong's²⁹	LiNiCoAlO₂	PSO-LSTM	2.3350	0.7860	—	<3.0
Ma's³³	LiNiCoAlO₂	DEGWO-LSTM	0.3598	0.4055	0.4886	—
Xu's³⁴	LiNiCoAlO₂	CNN-LSTM-Skip	0.3300	0.3700	—	<4.0
Lan's³⁵	LiCoO₂	GPR	0.7820	1.6380	—	<2.0
This work	LiFePO₄	SSA-LSTM	0.2375	0.3099	0.3536	<1.0
This work	LiNiCoAlO₂	SSA-LSTM	0.2346	0.2916	0.3499	<1.0

6. Conclusions

The kurtosis of the three peaks on the PDF curves and the partial charge-discharge times are used as HIs, and a long short-term memory network optimized by the sparrow search algorithm is constructed for accurate SOH prediction of NCA and LFP batteries using a low proportion of training data. The conclusions are as follows.

(1) The HIs, namely the kurtosis of the three peaks on the PDF curves and the partial charge-discharge times, each have a good linear correlation with battery SOH. These HIs serve as inputs to the battery SOH prediction model.
(2) The long short-term memory network optimized by the sparrow search algorithm can effectively select optimal hyperparameters, which makes high SOH prediction accuracy achievable with a small training set.
(3) With the proposed method, the RMSE, MAE and MAPE of SOH prediction are less than 1 % at a training set proportion as low as 20 %, while the absolute error is less than 2 % for NCA batteries and below 2.5 % for LFP batteries.

List of Abbreviations

AE	absolute error
BP	back propagation
BMS	battery management system
CC	constant current
CV	constant voltage
ECM	electrochemical models
EECM	equivalent circuit models
EIS	electrochemical impedance spectroscopy
GPR	Gaussian process regression
HIs	health indicators
HPPC	hybrid pulse power characterization
ICA	incremental capacity analysis
K	kurtosis
LAM	loss of active material
LFP	lithium iron phosphate
LIBs	lithium-ion batteries
LLI	loss of lithium inventory
LSTM	long short-term memory
MAE	mean absolute error
MAPE	mean absolute percentage error
NCA	nickel-cobalt-aluminum
ORI	ohmic resistance increase
PDF	probability density function
RMSE	root mean squared error
RNN	recurrent neural network
SOC	state of charge
SOH	state of health
SSA	sparrow search algorithm
SVM	support vector machine

List of Symbols

i_t	input gate at the current time
f_t	forget gate at the current time
O_t	output gate at the current time
Q_actual	available capacity of an aged battery
Q_initial	rated capacity of a new battery
	Pearson correlation coefficient
D_i	HI sequence
E_i	sequence of battery SOH
	voltage
	probability
	capacity

Acknowledgments

This work was sponsored by the Science and Technology Support Program of Guizhou Province ([2022] General 15, [2022] General 12) and Shanghai Key Laboratory of Materials Protection and Advanced Materials in Electric Power, China.

CRediT Authorship Contribution Statement

Yinsen Yu: Conceptualization (Lead), Data curation (Lead), Methodology (Lead), Software (Lead), Validation (Lead), Writing – original draft (Lead)

Yongxiang Cai: Data curation (Equal), Resources (Equal)

Wei Liu: Data curation (Supporting), Methodology (Supporting), Validation (Supporting)

Zhenlan Dou: Conceptualization (Equal), Resources (Equal), Validation (Supporting)

Bin Yao: Conceptualization (Supporting), Formal analysis (Equal), Software (Equal)

Bide Zhang: Data curation (Equal), Software (Supporting)

Qiangqiang Liao: Conceptualization (Equal), Funding acquisition (Lead), Methodology (Equal), Supervision (Lead), Writing – review & editing (Lead)

Zaiguo Fu: Data curation (Supporting), Supervision (Equal)

Zhiyuan Cheng: Project administration (Equal), Resources (Equal)

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding

Science and Technology Support Program of Guizhou Province: [2022] General 15, [2022] General 12

Shanghai Key Laboratory of Materials Protection and Advanced Materials in Electric Power, China

References

1) J. Xiong and D. Xu, Environ. Res., 194, 110718 (2021).
2) M. S. H. Lipu, M. A. Hannan, H. Aini, M. M. Hoque, P. J. Ker, M. H. M. Saad, and A. Afida, J. Clean. Prod., 205, 115 (2018).
3) R. Kumar, R. K. Singh, A. V. Alaferdov, and S. A. Moshkalev, Electrochim. Acta, 281, 78 (2018).
4) L. Chen, Y. Zhang, Y. Zheng, X. Li, and X. Zheng, Neurocomputing, 414, 245 (2020).
5) Z. Wei, J. Zhao, R. Xiong, G. Dong, and K. J. Tseng, IEEE Trans. Ind. Electron., 66, 5724 (2019).
6) X. Hu, L. Xu, X. Lin, and M. Pecht, Joule, 4, 24 (2020).
7) X. Hu, F. Feng, K. Liu, L. Zhang, J. Xie, and B. Liu, Renewable Sustainable Energy Rev., 114, 109334 (2019).
8) S. Jiang and Z. Song, J. Power Sources, 517, 230710 (2022).
9) M. Galeotti, L. Cinà, C. Giammanco, S. Cordiner, and A. D. Carlo, Energy, 89, 678 (2015).
10) Y. Cui, P. Zuo, C. Du, Y. Gao, J. Yang, X. Cheng, Y. Ma, and G. Yin, Energy, 144, 647 (2018).
11) J. Sun and J. Kainz, J. Energy Storage, 70, 108034 (2023).
12) S. Zhang, X. Guo, X. Dou, and X. Zhang, J. Power Sources, 479, 228740 (2020).
13) Z. Wang, G. Feng, D. Zhen, F. Gu, and A. Ball, Energy Rep., 7, 5141 (2021).
14) H. Tian, P. Qin, K. Li, and Z. Zhao, J. Clean. Prod., 261, 120813 (2020).
15) W. Li, D. Cao, D. Jöst, F. Ringbeck, M. Kuipers, F. Frie, and D. U. Sauer, Appl. Energy, 269, 115104 (2020).
16) X. Li, L. Ju, G. Geng, and Q. Jiang, Energy, 274, 127378 (2023).
17) K. Liu, X. Tang, R. Teodorescu, F. Gao, and J. Meng, IEEE Trans. Energ. Convers., 37, 1282 (2022).
18) V. Klass, M. Behm, and G. Lindbergh, J. Power Sources, 270, 262 (2014).
19) W. Qu, W. Shen, and J. Liu, J. Energy Storage, 42, 103102 (2021).
20) Y. Zhang and Y. Li, Renewable Sustainable Energy Rev., 161, 112282 (2022).
21) Z. Hou, W. Xu, G. Jia, J. Wang, and M. Cai, J. Electrochem. Soc., 171, 020550 (2024).
22) D. Yang, X. Zhang, R. Pan, Y. Wang, and Z. Chen, J. Power Sources, 384, 387 (2018).
23) X. Feng, J. Li, M. Ouyang, L. Lu, J. Li, and X. He, J. Power Sources, 232, 209 (2013).
24) K. Zaporojets, G. Bekoulis, J. Deleu, T. Demeester, and C. Develder, Expert Syst. Appl., 174, 114704 (2021).
25) C. Lin, X. Tuo, L. Wu, G. Zhang, and X. Zeng, Batteries, 10, 71 (2024).
26) W. Li, N. Sengupta, P. Dechent, D. Howey, A. Annaswamy, and D. U. Sauer, J. Power Sources, 482, 228863 (2021).
27) M. Gao, Z. Bao, C. Zhu, J. Jiang, Z. He, Z. Dong, and Y. Song, Energy Rep., 9, 2577 (2023).
28) S. Kim, Y. Y. Choi, K. J. Kim, and J. Choi, J. Energy Storage, 41, 102893 (2021).
29) Y. Gong, X. Zhang, D. Gao, H. Li, L. Yan, J. Peng, and Z. Huang, J. Energy Storage, 53, 105046 (2022).
30) I. Hazra and M. D. Pandey, Nucl. Eng. Des., 386, 111563 (2022).
31) J. Xue and B. Shen, Syst. Sci. Control Eng., 8, 22 (2020).
32) X. Pang, R. Huang, J. Wen, Y. Shi, J. Jia, and J. Zeng, Energies, 12, 2247 (2019).
33) Y. Ma, C. Shan, J. Gao, and H. Chen, Energy, 251, 123973 (2022).
34) H. Xu, L. Wu, S. Xiong, W. Li, A. Garg, and L. Gao, Energy, 276, 127585 (2023).
35) H. H. Goh, Z. Lan, D. Zhang, W. Dai, T. A. Kurniawan, and K. C. Goh, J. Energy Storage, 50, 104646 (2022).
