Lithium battery remaining life prediction method and system based on MVMD-Transformer-BiLSTM fusion framework

By integrating the MVMD and Transformer-BiLSTM framework, multi-scale decomposition and long-term feature capture of lithium battery capacity signals are achieved, solving the problem of model parameter calibration in lithium battery life prediction, improving prediction accuracy and generalization ability, and making it suitable for battery management in new energy vehicles and energy storage systems.

CN122307368A · Pending · Publication Date: 2026-06-30 · XINJIANG UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
XINJIANG UNIVERSITY
Filing Date
2026-04-07
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing lithium battery life prediction methods struggle to accurately calibrate model parameters under complex operating conditions, and existing hybrid models fail to fully leverage the complementary advantages of multivariate signal decomposition and deep learning networks, resulting in insufficient prediction accuracy and generalization ability.

Method used

A multivariate variational mode decomposition (MVMD) and Transformer-BiLSTM fusion framework is adopted. The lithium battery capacity signal is decomposed into multiple stationary sub-components through MVMD, and the Transformer is used to capture long-term time dependence and BiLSTM to enhance local dynamic features. Combined with the parameter adjustment of the decay stage and the noise robustness design, accurate prediction of the remaining life of lithium batteries is achieved.

Benefits of technology

It improves the accuracy and generalization ability of lithium battery life prediction under dynamic operating conditions, accurately characterizes complex aging processes, adapts to the degradation pattern of the entire life cycle, and meets the needs of actual industrial scenarios.

✦ Generated by Eureka AI based on patent content.


Abstract

The invention discloses a method and system for predicting the remaining life of lithium batteries based on the MVMD-Transformer-BiLSTM fusion framework, belonging to the technical field of battery life prediction. The method first decomposes the battery capacity time series into multiple stationary sub-modes using MVMD, reducing the complexity of the original signal. It then uses the self-attention mechanism of the Transformer to capture long-distance dependencies in the sequence and combines it with a BiLSTM network to enhance the bidirectional propagation of time-series features, constructing an end-to-end prediction model. Finally, the model is trained with the Adam optimizer to predict the battery capacity degradation trend and remaining life. Taking batteries B0006 and B0018 from NASA's public dataset as research objects, performance is evaluated with seven indicators, including mean squared error (MSE), root mean square error (RMSE), and coefficient of determination (R²). Experimental results show that the invention effectively captures the battery capacity degradation pattern, and that its prediction accuracy and generalization ability are significantly better than those of traditional single models, providing reliable technical support for lithium battery health management.

Description

Technical Field

[0001] This invention belongs to the field of lithium-ion battery technology and relates to a method and system for predicting the remaining lifetime of lithium batteries based on the MVMD-Transformer-BiLSTM fusion framework.

Background Technology

[0002] Lithium-ion batteries, with their advantages of high energy density, long cycle life, and environmental friendliness, have been widely used in key fields such as new energy vehicles, aerospace, and energy storage systems. Their capacity decay process is influenced by multiple factors, including material aging, charge / discharge rate, and ambient temperature, and exhibits significant nonlinear and non-stationary characteristics. Accurately predicting the remaining useful life (RUL) of lithium-ion batteries is a core technology for ensuring safe equipment operation, optimizing maintenance strategies, and reducing operating costs, and has significant engineering and academic value.

[0003] Currently, lithium battery life prediction methods are mainly classified into three categories: model-driven, data-driven, and hybrid-driven. Model-driven methods construct degradation models based on the internal electrochemical mechanisms of the battery, but the model parameters are difficult to calibrate accurately under complex operating conditions. Data-driven methods achieve prediction by mining the mapping relationship between monitoring data and lifespan. Among them, models such as recurrent neural networks (RNN) and LSTM are widely used in time series data processing, but LSTM has limited ability to capture long sequence dependencies. Transformer models, with their self-attention mechanism, overcome the length limitation of time series modeling and show advantages in sequence prediction tasks, but they are not adaptable to non-stationary signals.

[0004] Signal decomposition techniques can break down complex non-stationary signals into multiple stationary sub-components, improving model processing efficiency. Multivariate variational mode decomposition (MVMD), as an improved signal decomposition method, possesses strong noise resistance and high decomposition accuracy, and has been applied in fault diagnosis. However, its fusion application in lithium battery life prediction still requires further research. Existing hybrid models often combine a single decomposition technique with a single deep learning network, failing to fully leverage the complementary advantages of different models, and thus their prediction accuracy and generalization ability need improvement.

[0005] Therefore, a key issue in current lithium battery life prediction technology is how to construct a prediction framework that fully integrates multivariate signal decomposition with deep sequence modeling, so as to accurately characterize the complex aging process of lithium batteries and improve prediction accuracy and generalization under dynamic operating conditions. Addressing this issue is the purpose of the present patent solution.

Summary of the Invention

[0006] To address the aforementioned issues, the present invention proposes a method and system for predicting the remaining lifetime of lithium batteries based on the MVMD-Transformer-BiLSTM fusion framework.

[0007] To achieve the above objectives, the present invention provides the following technical solution: a lithium battery remaining lifetime prediction method based on the MVMD-Transformer-BiLSTM fusion framework, comprising the following steps:

[0008] S1: Time-series data preprocessing: Collect time-series data of lithium battery cycle discharge capacity, construct a sample set and divide it into training and test sets; reconstruct the time-series data using the sliding window method; normalize the input and output data using the mapminmax function;

[0009] S2: Multivariate Variational Mode Decomposition (MVMD) Processing:

[0010] Set the MVMD decomposition parameters as follows: penalty factor α=2000, noise margin τ=0, number of decomposed modes K=10, no DC component, tolerance tol=1e-10;

[0011] Perform MVMD decomposition on the normalized capacity time-series signal to obtain the intrinsic mode function (IMF) components;

[0012] MVMD decomposition achieves synchronous decomposition of multi-channel signals by solving a variational optimization problem. Its objective function and constraints are as follows:

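[0013] $$\text{s.t.}\quad \sum_{k=1}^{K} u_k^{(c)}(t) = f^{(c)}(t), \qquad c = 1, 2, \ldots, C$$

[0014] Minimize the objective function:

[0015] $$\min_{\{u_k^{(c)}\},\,\{\omega_k\}} \; \sum_{k=1}^{K} \sum_{c=1}^{C} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k^{(c)}(t) \right] e^{-j\omega_k t} \right\|_2^2$$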

[0016] Where C is the total number of signal channels; K is the total number of modes to be decomposed; f^(c)(t) is the original input signal of the c-th channel; u_k^(c)(t) is the k-th mode component of the c-th channel; ω_k is the common center frequency of the k-th mode; δ(t) is the Dirac delta function; j is the imaginary unit; * denotes the convolution operation; and ∂_t is the partial derivative with respect to time;

[0017] S3: Transformer-BiLSTM hybrid network training:

[0018] S3.1 Construct a fusion prediction network, which sequentially includes a sequence input layer, a position embedding layer, a self-attention layer, a BiLSTM layer, a regularization layer, a fully connected layer, and a regression output layer; wherein, the position embedding layer is configured with position codes of a maximum length of 512; the self-attention layer is configured with 4 attention heads, each attention head having 128 key channels, and using a causal mask; the BiLSTM layer has 10 hidden layer units, and the activation function is ReLU; the regularization layer includes a dropout sub-layer with a dropout rate of 0.01 and an L2 regularization sub-layer with a coefficient of 0.001;

[0019] The attention weights of the self-attention layer are:

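[0020] $$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V$$

[0021] Where Q, K, and V are the query, key, and value matrices and d_k is the dimension of the key vector; the attention is computed in parallel by four attention heads to capture temporal dependencies at different scales;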

[0022] S3.2 Input each IMF component into the fusion prediction network, output the capacity prediction value of each component through the fully connected layer, and sum the prediction values of all components to obtain the total capacity prediction value;

[0023] S3.3 The Adam optimizer is used to train the fusion prediction network. The training parameters are set as follows: maximum number of training rounds 1000, batch size 256, initial learning rate 0.001, learning rate decrease factor 0.1, learning rate decrease period 800 rounds, gradient clipping threshold 10. The dataset is shuffled round by round during training.

[0024] S4: Remaining lifetime prediction and performance evaluation: Input the test set data into the trained fusion prediction network, output the capacity prediction value and perform inverse normalization.

[0025] Furthermore, the Intrinsic Mode Function (IMF) components include a high-frequency component characterizing short-term capacity fluctuations, a mid-frequency component characterizing medium-term decay fluctuations, and a low-frequency component characterizing long-term aging trends.

[0026] A system for predicting the remaining life of lithium batteries includes the following functional modules that are sequentially connected in communication:

[0027] Data acquisition module: used to acquire time-series data of lithium battery cycle discharge capacity, and supports access to public battery life datasets or battery monitoring data under actual industrial operating conditions;

[0028] Preprocessing module: Built-in sliding window time series reconstruction logic and mapminmax normalization / denormalization algorithm to complete data format conversion and unit unification;

[0029] MVMD decomposition module: configures the MVMD decomposition parameters and executes the algorithm logic, realizing multi-scale stationarization decomposition of non-stationary capacity signals;

[0030] Hybrid prediction module: Set up a Transformer-BiLSTM fusion prediction network with built-in Adam optimizer training logic to complete the capacity prediction and result fusion of each IMF component;

[0031] Evaluation module: Configures the calculation logic of performance indicators, supports the accuracy evaluation and visualization output of prediction results.

[0032] Furthermore, the hybrid prediction module is also equipped with an adaptive parameter adjustment unit for the decay stage, an EOL dynamic correction unit, a small sample transfer learning unit, and a noise robustness optimization unit.

[0033] A method for predicting the remaining lifetime of lithium batteries based on the MVMD-Transformer-BiLSTM fusion framework, comprising the following steps:

[0034] S1: Data Preprocessing: Collect time-series data of lithium battery cycle discharge capacity, select NASA public datasets or battery capacity data under actual working conditions as the sample set, and divide them into training set and test set; use the sliding window method to reconstruct the time-series data, set the delay step size to 3 and the prediction span to 1, map the capacity data of the first 3 cycles as input features, and the capacity of the 4th cycle as the output label; use the mapminmax function to normalize the data to the [0,1] interval to improve the stability of model training.

[0035] S2: MVMD Multiscale Stabilization Decomposition: Set MVMD parameters: penalty factor α=2000, noise margin τ=0, number of decomposed modes K=10, no DC component, tolerance tol=1e-10; perform multivariate variational mode decomposition on the normalized capacity time series signal to obtain 10 intrinsic mode function (IMF) components, of which high-frequency components (IMF1-IMF3) represent short-term capacity fluctuations, mid-frequency components (IMF4-IMF7) represent mid-term chemical reaction fluctuations, and low-frequency components (IMF8-IMF10) represent long-term decay trends; verify that each IMF component has no mode aliasing and can completely reconstruct the original signal, realizing the stabilization layering of non-stationary signals.

[0036] S3: Transformer-BiLSTM Hybrid Network Modeling

[0037] S3.1 Constructs a fusion network structure, including a sequence input layer, a position embedding layer, a self-attention layer, a BiLSTM layer, a regularization layer, and an output layer:

[0038] The input layer receives temporal feature vectors with a dimension of 3;

[0039] The position embedding layer adds positional encodings with a maximum length of 512 to preserve timing information;

[0040] The self-attention layer is configured with 4 attention heads, each with 128 channels, and uses causal masking to capture long-range temporal dependencies.

[0041] The BiLSTM layer has 10 hidden layer units and uses ReLU as the activation function to mine local bidirectional temporal features.

[0042] The regularization layer introduces dropout (rate=0.01) and L2 regularization (λ=0.001) to suppress overfitting;

[0043] S3.2 Input each IMF component into the hybrid network for training, output the predicted value of each component through the fully connected layer, and sum them to obtain the total capacity prediction value.

[0044] S4: Model Training and Prediction

[0045] The model was trained using the Adam optimizer with the following parameters: maximum training epochs 1000, batch size 256, initial learning rate 0.001, learning rate descent factor 0.1, descent period 800 epochs, and gradient clipping threshold 10.

[0046] During training, the dataset is shuffled round by round to improve generalization ability. After training, the prediction results are denormalized to restore them to the original capacity units.

[0047] The prediction accuracy is assessed with 7 indicators (MAE, MBE, MSE, RMSE, R², RPD, MAPE), achieving precise prediction of lithium battery capacity degradation trends and remaining lifespan.

[0048] Corresponding to the above methods, a lithium battery remaining life prediction system is constructed, which includes a data acquisition module, a preprocessing module, an MVMD decomposition module, a hybrid prediction module, and an evaluation module. The modules work together to achieve end-to-end RUL prediction.

[0049] The beneficial effects of this invention are as follows: This invention achieves accurate separation of multi-scale features of battery attenuation and noise through MVMD, solving the problem of non-stationary signal modeling;

[0050] Transformer captures long-term global dependencies, BiLSTM enhances local dynamic details, taking into account both global and local features, and performs dual-branch temporal fusion.

[0051] By combining parameter adjustment during the decay stage with dynamic EOL correction, it adapts to the decay pattern throughout the entire life cycle; it introduces small-sample transfer learning and noise-robust design to meet the needs of actual industrial scenarios.

Description of the Drawings

[0052] To make the objectives, technical solutions, and advantages of the present invention clearer, the preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings, wherein:

[0053] Figure 1 is the overall flowchart of the present invention.

[0054] Figure 2 compares the capacity decay curves of the four NASA batteries.

[0055] Figure 3 shows the original signal curves of the two batteries: (a) original signal of battery B0006, (b) original signal of battery B0018.

[0056] Figure 4 shows the MVMD decomposition results of the two batteries: (a) MVMD decomposition results of battery B0006, (b) MVMD decomposition results of battery B0018.

[0057] Figure 5 shows the spectrum diagrams of the two batteries: (a) spectrum diagram of battery B0006, (b) spectrum diagram of battery B0018.

[0058] Figure 6 shows the prediction results and cycle error for the B0006 battery training set: (a) comparison of prediction results for the B0006 battery training set, (b) cycle error of the B0006 battery.

[0059] Figure 7 shows the prediction results and cycle error for the B0018 battery test set: (a) comparison of prediction results for the B0018 battery test set, (b) cycle error of the B0018 battery.

[0060] Figure 8 is the prediction graph of the MVMD-Transformer-BiLSTM model of the present invention.

[0061] Figure 9 shows the training set and test set effect diagrams of the present invention: (a) training set effect diagram, (b) test set effect diagram.

Detailed Implementation

[0062] The present invention will be further described below with reference to the accompanying drawings and embodiments. It should be noted that the following detailed description is illustrative and intended to provide further explanation of the invention. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains.

[0063] (I) Data Acquisition and Sample Set Construction

[0064] 1. Data source: The lithium battery cycle life dataset publicly available from NASA Ames Prognostics Center of Excellence is used. This dataset contains cycle charge-discharge test data of four 18650 lithium batteries, namely B0005, B0006, B0007, and B0018. The test conditions are: ambient temperature of 25°C, constant current charging to 4.2V at 1.5A, and constant current discharging to 2.7V at 2A. Discharge capacity data is recorded once for each cycle.

[0065] 2. Sample set division: 168 sets of cycle capacity data of the B0006 battery were selected as the training set (cycle numbers 1-168), and 175 sets of cycle capacity data of the B0018 battery were selected as the test set (cycle numbers 1-175). The battery capacity failure threshold was set to 1.4Ah, which serves as the end-of-life (EOL) criterion.

[0066] (II) Time Series Data Reconstruction

[0067] The sliding window method is used to reconstruct the training and test sets separately. The specific operations are as follows:

[0068] 1. Set the delay step size kim=3 and the prediction span zim=1. That is, for any cycle index i (i≥4), the input feature vector X=[C(i-3), C(i-2), C(i-1)] is built from the capacity data of cycles i-3, i-2, and i-1, and the capacity C(i) of the i-th cycle is used as the output label y.

[0069] 2. After reconstruction, the input feature matrix of the training set has a dimension of (165, 3) (165 samples, 3 features per sample), and the output label vector has a dimension of (165, 1); the input feature matrix of the test set has a dimension of (172, 3), and the output label vector has a dimension of (172, 1).
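The sliding-window reconstruction described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `sliding_window` and the synthetic capacity series are assumptions made for the example and do not come from the patent.

```python
from typing import List, Sequence, Tuple


def sliding_window(capacity: Sequence[float], lag: int = 3, horizon: int = 1
                   ) -> Tuple[List[List[float]], List[float]]:
    """Turn a capacity series into (features, labels) pairs.

    Each sample uses `lag` consecutive cycles as input features and the
    cycle `horizon` steps after the window as the output label.
    """
    features: List[List[float]] = []
    labels: List[float] = []
    for end in range(lag, len(capacity) - horizon + 1):
        features.append(list(capacity[end - lag:end]))   # C(i-3), C(i-2), C(i-1)
        labels.append(capacity[end + horizon - 1])       # C(i)
    return features, labels


# Synthetic placeholder series with 168 cycles (not real NASA data).
train_capacity = [2.0 - 0.003 * i for i in range(168)]
X_train, y_train = sliding_window(train_capacity, lag=3, horizon=1)
print(len(X_train), len(X_train[0]), len(y_train))
```

With 168 cycles, a delay step of 3 and a prediction span of 1, the sketch yields 165 samples of 3 features each, matching the (165, 3) training matrix stated above; 175 cycles yield 172 samples.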

[0070] (III) Data normalization processing

[0071] The MinMaxScaler tool (equivalent to the mapminmax function) in the Scikit-learn library is called to normalize the input features and output label data to the [0,1] interval. The specific formula (5) is as follows:

[0072] $$x_{\text{norm}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (5)$$

[0073] The minimum value of the original data in the training set is x_min = 1.382Ah, and the maximum value is x_max = 2.015Ah; the minimum value of the original data in the test set is x_min = 1.357Ah, and the maximum value is x_max = 2.083Ah. The training set and the test set are each normalized independently using their own statistics, to avoid data leakage.
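The min-max mapping to the [0,1] interval, and the inverse mapping that later restores capacity units, can be sketched as follows. The helper names are hypothetical and the code uses only the standard library rather than the MinMaxScaler tool mentioned above; the bounds 1.382 Ah and 2.015 Ah are the training-set values quoted in this paragraph.

```python
from typing import List, Sequence


def normalize(values: Sequence[float], x_min: float, x_max: float) -> List[float]:
    """Map values linearly so that x_min -> 0 and x_max -> 1."""
    if x_max == x_min:
        raise ValueError("x_max must differ from x_min")
    return [(v - x_min) / (x_max - x_min) for v in values]


def denormalize(values: Sequence[float], x_min: float, x_max: float) -> List[float]:
    """Inverse of `normalize`: restore the original units (Ah)."""
    return [v * (x_max - x_min) + x_min for v in values]


# Training-set bounds quoted in the text.
X_MIN, X_MAX = 1.382, 2.015
scaled = normalize([1.382, 1.700, 2.015], X_MIN, X_MAX)
restored = denormalize(scaled, X_MIN, X_MAX)
```

Keeping the bounds as explicit arguments makes it visible which statistics (training or test) are being applied to which data.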

[0074] III. MVMD Decomposition and Implementation Process

[0075] (I) Parameter Configuration

[0076] MVMD decomposition parameter settings: penalty factor α=2000, noise margin τ=0, number of decomposition modes K=10, no DC component (DC=0), tolerance tol=1e-10. The variational optimization problem is solved iteratively using the alternating direction method of multipliers (ADMM), with the upper limit of the number of iterations set to 1000.

[0077] (II) Implementation of Multi-Scale Signal Decomposition Layer (MVMD Decomposition)

[0078] 1. Decomposition parameter configuration

[0079] The core parameters of MVMD are strictly set as follows: penalty factor α=2000, noise margin τ=0, number of decomposed modes K=10, no DC component (DC=0), and convergence tolerance tol=1e-10. The variational optimization problem is solved iteratively using the alternating direction method of multipliers (ADMM), with an upper limit of 1000 iterations.
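For orientation, the ADMM iteration that these parameters control is usually written in the frequency domain as below. These update rules are taken from the standard MVMD literature and are not spelled out in this patent, so they should be read as an assumed reference form; hats denote Fourier transforms, n is the iteration index, and λ is the Lagrange multiplier.

```latex
% Mode update (Wiener-filter form), for each mode k and channel c
\hat{u}_k^{(c),\,n+1}(\omega) =
  \frac{\hat{f}^{(c)}(\omega) - \sum_{i \neq k} \hat{u}_i^{(c)}(\omega)
        + \tfrac{1}{2}\,\hat{\lambda}^{(c),\,n}(\omega)}
       {1 + 2\alpha\,\bigl(\omega - \omega_k^{\,n}\bigr)^2}

% Common center-frequency update, shared across all C channels
\omega_k^{\,n+1} =
  \frac{\sum_{c=1}^{C} \int_0^{\infty} \omega\,
        \bigl|\hat{u}_k^{(c),\,n+1}(\omega)\bigr|^2 \, d\omega}
       {\sum_{c=1}^{C} \int_0^{\infty}
        \bigl|\hat{u}_k^{(c),\,n+1}(\omega)\bigr|^2 \, d\omega}

% Dual ascent on the multiplier with step \tau
\hat{\lambda}^{(c),\,n+1}(\omega) = \hat{\lambda}^{(c),\,n}(\omega)
  + \tau \Bigl( \hat{f}^{(c)}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{(c),\,n+1}(\omega) \Bigr)

% Stopping rule
\sum_{k=1}^{K} \sum_{c=1}^{C}
  \frac{\bigl\| \hat{u}_k^{(c),\,n+1} - \hat{u}_k^{(c),\,n} \bigr\|_2^2}
       {\bigl\| \hat{u}_k^{(c),\,n} \bigr\|_2^2} < \text{tol}
```

In this form, α=2000 sets the bandwidth of each mode, tol=1e-10 is the stopping threshold, and τ=0 leaves the multiplier at its initial value, so the reconstruction constraint acts as a least-squares penalty rather than being enforced exactly. For a single-channel capacity signal (C=1), the updates reduce to ordinary VMD.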

[0080] 2. Signal Decomposition and Verification

[0081] MVMD decomposition was performed on the normalized training and test set time series signals, respectively, yielding 10 IMF components in each case, where:

[0082] High-frequency components (IMF1-IMF3): characterize short-term random fluctuations in battery capacity;

[0083] Mid-frequency components (IMF4-IMF7): characterize the mid-term capacity fluctuations caused by internal chemical reactions in the battery;

[0084] Low-frequency components (IMF8-IMF10): characterize the long-term capacity degradation trend caused by battery aging;

[0085] Decomposition result verification:

[0086] 1) Time-domain verification: The 10 IMF components are superimposed and reconstructed. The mean square error (MSE) between the reconstructed signal and the original signal is less than 1e-6, verifying the completeness of the decomposition;

[0087] 2) Frequency domain verification: The spectrum of each IMF component was analyzed by Fast Fourier Transform (FFT). The energy of the high frequency component is concentrated in 0.1-0.4Hz, the mid frequency component is in 0.05-0.3Hz, and the low frequency component is below 0.05Hz. The spectrum of each component does not overlap, verifying no mode aliasing.
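The two verification checks can be sketched as follows, assuming the IMF components are already available as equal-length lists. The function names are hypothetical, the two sinusoids are synthetic stand-ins for real IMFs, and a direct O(n²) discrete Fourier transform is used here in place of an FFT library call so that the example needs only the standard library.

```python
import cmath
import math
from typing import List, Sequence


def reconstruction_mse(components: Sequence[Sequence[float]],
                       original: Sequence[float]) -> float:
    """MSE between the sum of all components and the original signal."""
    rebuilt = [sum(values) for values in zip(*components)]
    return sum((r - o) ** 2 for r, o in zip(rebuilt, original)) / len(original)


def dominant_frequency(signal: Sequence[float], sample_rate: float = 1.0) -> float:
    """Frequency of the strongest non-DC bin of a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


# Synthetic stand-ins for a low-frequency and a high-frequency component.
n = 100
slow: List[float] = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
fast: List[float] = [0.1 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
original = [s + f for s, f in zip(slow, fast)]

mse = reconstruction_mse([slow, fast], original)
peaks = [dominant_frequency(slow), dominant_frequency(fast)]
```

A decomposition passes the time-domain check when `mse` is below 1e-6, and the per-component peak frequencies give a quick view of whether components occupy separate bands.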

[0088] (III) Implementation of Deep Sequence Prediction Layer (Transformer-BiLSTM Hybrid Network)

[0089] 1. Network Structure Construction

[0090] The fusion network consists of the following layers from input to output, with parameters and functions for each layer as follows:

[0091] Sequence input layer: Receives temporal feature vectors with dimension 3, adapted to the dimension of the reconstructed input features;

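[0092] Position embedding layer: Adds positional encodings with a maximum length of 512 to preserve the order information of the time series data. The positional encoding formulas (6) and (7) are:

[0093] $$PE_{(pos,\,2i)} = \sin\!\left( \frac{pos}{10000^{\,2i/d_{\text{model}}}} \right) \qquad (6)$$

[0094] $$PE_{(pos,\,2i+1)} = \cos\!\left( \frac{pos}{10000^{\,2i/d_{\text{model}}}} \right) \qquad (7)$$

[0095] Where pos is the position in the sequence, i is the dimension index, and d_model=3 is the input feature dimension;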

[0096] Self-attention layer: Four attention heads are set, each with 128 key channels. Causal masking is used to avoid future information leakage. The attention weight calculation formula (8) is as follows:

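[0097] $$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V \qquad (8)$$

[0098] Where Q, K, and V are the query, key, and value matrices and d_k is the dimension of the key vector; this layer is used to capture long-range temporal dependencies;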

[0099] BiLSTM layer: The number of hidden layer units is 10, the activation function is ReLU, and the forward LSTM and backward LSTM process sequence features respectively, which enhances the mining of local bidirectional temporal features;

[0100] Regularization layer: Includes dropout layer (dropout rate=0.01) and L2 regularization (λ=0.001) to suppress model overfitting;

[0101] Fully connected layer + regression output layer: Maps the BiLSTM output to a one-dimensional capacity prediction value, without activation function, suitable for regression prediction scenarios.
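Two of the layers listed above, the position embedding and the causally masked self-attention, can be illustrated with a small standard-library sketch. It assumes sinusoidal positional encodings and shows a single attention head without learned projection matrices, whereas the network described here uses 4 heads with 128 key channels each; the function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def positional_encoding(length: int, d_model: int) -> Matrix:
    """Sinusoidal encoding: sine on even dimensions, cosine on odd ones."""
    table: Matrix = []
    for pos in range(length):
        row = []
        for j in range(d_model):
            angle = pos / (10000 ** (2 * (j // 2) / d_model))
            row.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def causal_attention(q: Matrix, k: Matrix, v: Matrix) -> Tuple[Matrix, Matrix]:
    """Single-head scaled dot-product attention with a causal mask.

    Position i may only attend to positions 0..i, so no future
    information leaks into the output for step i.
    """
    d_k = len(k[0])
    outputs: Matrix = []
    weights: Matrix = []
    for i, q_i in enumerate(q):
        scores = [sum(a * b for a, b in zip(q_i, k[j])) / math.sqrt(d_k)
                  for j in range(i + 1)]
        top = max(scores)                       # subtract max for stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps] + [0.0] * (len(k) - i - 1)
        weights.append(row)
        outputs.append([sum(row[j] * v[j][c] for j in range(len(v)))
                        for c in range(len(v[0]))])
    return outputs, weights


# Three time steps with 3 features each, plus positional information.
x = [[0.9, 0.8, 0.7], [0.8, 0.7, 0.6], [0.7, 0.6, 0.5]]
pe = positional_encoding(len(x), d_model=3)
x_pos = [[a + b for a, b in zip(row, pe_row)] for row, pe_row in zip(x, pe)]
out, attn = causal_attention(x_pos, x_pos, x_pos)
```

In `attn`, every entry above the diagonal is zero, which is the effect of the causal mask; a multi-head layer would run several such heads on separately projected inputs and concatenate the results.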

[0102] 2. Model Training

[0103] Loss function: The mean squared error (MSE) is used as the training loss function, as shown in formula (9):

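[0104] $$L_{\text{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left( y_{\text{true},i} - y_{\text{pred},i} \right)^2 \qquad (9)$$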

[0105] Where N is the number of batch samples, y_true,i is the true capacity value, and y_pred,i is the predicted capacity value;

[0106] Optimizer: The Adam optimizer is used, with the following parameters: maximum number of training epochs 1000, batch size 256, initial learning rate 0.001, learning rate descent factor 0.1, learning rate descent period 800 epochs, and gradient clipping threshold 10.

[0107] Training strategy: Input the 10 IMF components into the hybrid network to construct 10 parallel training branches. Shuffle the order of the dataset before each round of training and adopt an early stopping strategy (training is terminated if the test set loss does not decrease for 50 consecutive rounds). After training, sum the predicted values of the IMF components output by each branch to obtain the total capacity prediction value.
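The training controls listed above (stepwise learning-rate decay, gradient clipping, and early stopping) can be sketched independently of any deep learning framework. The function names are hypothetical, and two points are assumptions rather than statements from the patent: epochs are counted from 1, and the clipping threshold of 10 is applied to the L2 norm of the gradient.

```python
import math
from typing import List, Sequence


def learning_rate(epoch: int, initial: float = 1e-3,
                  drop_factor: float = 0.1, drop_period: int = 800) -> float:
    """Piecewise-constant schedule; `epoch` is 1-indexed (assumption)."""
    return initial * drop_factor ** ((epoch - 1) // drop_period)


def clip_by_norm(gradient: Sequence[float], threshold: float = 10.0) -> List[float]:
    """Rescale the gradient if its L2 norm exceeds the threshold (assumption)."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= threshold:
        return list(gradient)
    return [g * threshold / norm for g in gradient]


def should_stop(loss_history: Sequence[float], patience: int = 50) -> bool:
    """True once the monitored loss has not improved for `patience` epochs."""
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    return min(loss_history[-patience:]) >= best_before


def total_prediction(branch_predictions: Sequence[Sequence[float]]) -> List[float]:
    """Sum the per-IMF branch predictions into the total capacity prediction."""
    return [sum(values) for values in zip(*branch_predictions)]
```

Under these assumptions, epochs 1 to 800 run at 0.001 and epochs 801 to 1000 at 0.0001, so with a 1000-epoch limit the rate drops exactly once.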

[0108] (IV) Implementation of Life Assessment Layer

[0109] 1. Inverse normalization and remaining lifetime calculation

[0110] Inverse normalization: The capacity is restored to its original dimensions through the inverse normalization operation, formula (10):

[0111] $$x = x_{\text{norm}} \left( x_{\max} - x_{\min} \right) + x_{\min} \qquad (10)$$

[0112] Remaining life calculation: Set the battery capacity failure threshold to 1.4Ah. Based on the predicted capacity decay curve, find the number of cycles N_pred when the battery capacity first falls below the failure threshold. The current number of cycles is N_current. Then the remaining life RUL = N_pred - N_current.
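The remaining life rule RUL = N_pred - N_current can be sketched as follows. The function names and the short capacity curve are illustrative assumptions; cycle numbering is taken to start at 1, and the sketch returns `None` when the predicted curve never falls below the threshold.

```python
from typing import Optional, Sequence


def predicted_eol_cycle(capacity_by_cycle: Sequence[float],
                        threshold: float = 1.4,
                        first_cycle: int = 1) -> Optional[int]:
    """First cycle number whose predicted capacity is below the threshold."""
    for offset, capacity in enumerate(capacity_by_cycle):
        if capacity < threshold:
            return first_cycle + offset
    return None


def remaining_useful_life(capacity_by_cycle: Sequence[float],
                          current_cycle: int,
                          threshold: float = 1.4) -> Optional[int]:
    """RUL = N_pred - N_current, or None if the threshold is never crossed."""
    n_pred = predicted_eol_cycle(capacity_by_cycle, threshold)
    if n_pred is None:
        return None
    return n_pred - current_cycle


# Illustrative predicted capacities (Ah) for cycles 1..6.
curve = [1.80, 1.60, 1.50, 1.41, 1.39, 1.35]
rul = remaining_useful_life(curve, current_cycle=2)
```

Here the capacity first drops below 1.4 Ah at cycle 5, so the remaining life seen from cycle 2 is 3 cycles.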

[0113] 2. Prediction accuracy assessment

[0114] Model performance is evaluated using seven core metrics: mean absolute error (MAE), mean bias error (MBE), mean square error (MSE), root mean square error (RMSE), coefficient of determination (R²), residual prediction deviation (RPD), and mean absolute percentage error (MAPE). The calculation logic for each metric, along with the results in this embodiment, is as follows:

[0115] Table 1 Performance Evaluation Indicators

[0116]
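The seven indicators named above can be computed as in the following sketch. The exact definitions used in the patent's table are not reproduced in this text, so the conventions below are assumptions: MBE is taken as the mean of (predicted minus true), RPD as the sample standard deviation of the true values divided by the RMSE, and MAPE is expressed in percent. The function name `evaluate` and the four-point example are illustrative only.

```python
import math
from typing import Dict, Sequence


def evaluate(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, MBE, MSE, RMSE, R2, RPD and MAPE for one prediction run."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mbe = sum(errors) / n                      # sign convention: pred - true
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    std_true = math.sqrt(ss_tot / (n - 1))     # sample standard deviation
    rpd = std_true / rmse if rmse > 0 else math.inf
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "MBE": mbe, "MSE": mse, "RMSE": rmse,
            "R2": r2, "RPD": rpd, "MAPE": mape}


# Illustrative capacities in Ah (not results from the patent).
metrics = evaluate([2.0, 1.8, 1.6, 1.4], [2.1, 1.8, 1.5, 1.4])
```

An RPD above 3, the level cited later in this embodiment, corresponds under this definition to an RMSE smaller than one third of the spread of the true capacities.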

[0117] IV. System Module Deployment and Implementation

[0118] The prediction method of this invention can be packaged into an industrial-grade lithium battery remaining life prediction system, which includes 5 core functional modules. The functions and deployment logic of each module are as follows:

[0119] 1. Data Acquisition Module: Supports two data access methods—offline import of historical datasets in .mat / .csv format, and reading real-time capacity data from the NEWALE battery test cabinet via RS485 serial port. The data is stored in a standardized CSV format.

[0120] 2. Preprocessing module: Built-in sliding window reconstruction, normalization / denormalization algorithms, one-click completion of data format conversion and unit unification, outputting a standardized feature matrix;

[0121] 3. MVMD Decomposition Module: Solidifies decomposition parameters and verification logic, automatically performs signal decomposition and visualizes IMF components and time-frequency characteristics, and supports alarms for abnormal decomposition results;

[0122] 4. Hybrid Prediction Module: Loads a pre-trained Transformer-BiLSTM network model, supports batch prediction (processing historical data) and real-time prediction (processing second-level collected data), and has built-in adaptive parameter adjustment during decay stage, dynamic EOL correction, few-shot transfer learning, and noise robustness optimization logic.

[0123] 5. Evaluation Module: Automatically calculates 7 evaluation indicators and generates visual reports such as capacity prediction curves, error distribution maps, and regression fitting maps, supporting export in PDF / Excel format.

[0124] The system operation process is as follows: system startup → data acquisition / import → preprocessing → MVMD decomposition → model prediction → lifetime calculation → evaluation report generation. The entire process is automated, with a response time of ≤5s (1000 data points per batch).

[0125] V. Verification of Implementation Results

[0126] In this embodiment, the predicted curve for the training-set B0006 battery fits the true curve very closely (R² = 0.998). During the rapid degradation phase of the test-set B0018 battery (cycles 120-175), the RMSE remained below 0.036 Ah and the remaining lifetime prediction error was ≤5 cycles. Compared with the traditional single BiLSTM model, the RMSE of this invention was reduced by 43.75% and R² improved by 6.19 percentage points. Generalization and anti-interference capabilities are significantly enhanced, and the method can be directly deployed in industrial scenarios such as new energy vehicle battery management systems (BMS) and energy storage power station battery health monitoring platforms.

[0127] Using the NASA dataset B0006 battery as the training set and B0018 battery as the test set, the model was trained and predicted according to the steps described above. The model was verified to have an MAE of 0.006 and a MAPE of 0.32% on the training set, and an MAE of 0.028 and a MAPE of 1.56% on the test set. The RPD value was greater than 3, which meets the accuracy and reliability requirements for engineering applications.

Claims

1. A method for predicting the remaining lifetime of lithium batteries based on the MVMD-Transformer-BiLSTM fusion framework, characterized in that it includes the following steps:

S1: Time-series data preprocessing: collect time-series data of lithium battery cycle discharge capacity, construct a sample set and divide it into training and test sets; reconstruct the time-series data using the sliding window method; normalize the input and output data using the mapminmax function;

S2: Multivariate variational mode decomposition (MVMD) processing: set the MVMD decomposition parameters as follows: penalty factor α=2000, noise margin τ=0, number of decomposed modes K=10, no DC component, tolerance tol=1e-10; perform MVMD decomposition on the normalized capacity time-series signal to obtain the intrinsic mode function (IMF) components; MVMD decomposition achieves synchronous decomposition of multi-channel signals by solving a variational optimization problem, whose objective function and constraints are as follows:

$$\text{s.t.}\quad \sum_{k=1}^{K} u_k^{(c)}(t) = f^{(c)}(t), \qquad c = 1, 2, \ldots, C$$

Minimize the objective function:

$$\min_{\{u_k^{(c)}\},\,\{\omega_k\}} \; \sum_{k=1}^{K} \sum_{c=1}^{C} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k^{(c)}(t) \right] e^{-j\omega_k t} \right\|_2^2$$

where C is the total number of signal channels; K is the total number of modes to be decomposed; f^(c)(t) is the original input signal of the c-th channel; u_k^(c)(t) is the k-th mode component of the c-th channel; ω_k is the common center frequency of the k-th mode; δ(t) is the Dirac delta function; j is the imaginary unit; * denotes the convolution operation; and ∂_t is the partial derivative with respect to time;

S3: Transformer-BiLSTM hybrid network training:

S3.1 Construct a fusion prediction network, which sequentially includes a sequence input layer, a position embedding layer, a self-attention layer, a BiLSTM layer, a regularization layer, a fully connected layer, and a regression output layer; wherein the position embedding layer is configured with position codes of a maximum length of 512; the self-attention layer is configured with 4 attention heads, each attention head having 128 key channels, and uses a causal mask; the BiLSTM layer has 10 hidden layer units, and the activation function is ReLU; the regularization layer includes a dropout sub-layer with a dropout rate of 0.01 and an L2 regularization sub-layer with a coefficient of 0.001; the attention weights of the self-attention layer are:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V$$

where Q, K, and V are the query, key, and value matrices and d_k is the dimension of the key vector; the attention is computed in parallel by four attention heads to capture temporal dependencies at different scales;

S3.2 Input each IMF component into the fusion prediction network, output the capacity prediction value of each component through the fully connected layer, and sum the prediction values of all components to obtain the total capacity prediction value;

S3.3 Train the fusion prediction network with the Adam optimizer, with the training parameters set as follows: maximum number of training rounds 1000, batch size 256, initial learning rate 0.001, learning rate decrease factor 0.1, learning rate decrease period 800 rounds, gradient clipping threshold 10; the dataset is shuffled round by round during training;

S4: Remaining lifetime prediction and performance evaluation: input the test set data into the trained fusion prediction network, output the capacity prediction value and perform inverse normalization.

2. The method for predicting the remaining life of a lithium battery according to claim 1, characterized in that the intrinsic mode function (IMF) components include a high-frequency component characterizing short-term capacity fluctuations, a mid-frequency component characterizing medium-term decay fluctuations, and a low-frequency component characterizing long-term aging trends.

3. A system for implementing the lithium battery remaining life prediction method according to claim 1 or 2, characterized in that it includes the following functional modules connected in sequence: a data acquisition module, used to acquire time-series data of lithium battery cycle discharge capacity, and supporting access to public battery life datasets or battery monitoring data under actual industrial operating conditions; a preprocessing module, with built-in sliding window time series reconstruction logic and mapminmax normalization / denormalization algorithm, to complete data format conversion and unit unification; an MVMD decomposition module, which configures the MVMD decomposition parameters and executes the algorithm logic, realizing multi-scale stationarization decomposition of non-stationary capacity signals; a hybrid prediction module, which sets up a Transformer-BiLSTM fusion prediction network with built-in Adam optimizer training logic to complete the capacity prediction and result fusion of each IMF component; and an evaluation module, which configures the calculation logic of performance indicators and supports the accuracy evaluation and visualization output of prediction results.

4. The lithium battery remaining life prediction system according to claim 3, characterized in that the hybrid prediction module is also equipped with an adaptive parameter adjustment unit for the decay stage, an EOL dynamic correction unit, a small sample transfer learning unit, and a noise robustness optimization unit.