A double-branch time sequence prediction method based on frequency amplitude shaping and cycle alignment
By employing a two-branch time series prediction method that combines frequency amplitude shaping and period alignment, the periodic component and residual component are explicitly decoupled, thus solving the problems of amplitude scale imbalance and learning interference, and improving the accuracy and stability of time series prediction.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ZHEJIANG NORMAL UNIV
- Filing Date
- 2026-05-13
- Publication Date
- 2026-06-19
AI Technical Summary
Existing time series prediction models suffer from amplitude scale imbalance and joint learning interference when dealing with high-amplitude periodic components and irregular residual components, leading to decreased prediction accuracy and systematic shifts in frequency domain supervision signals.
A two-branch time series prediction method with frequency amplitude shaping and period alignment is adopted. The period component and residual component are extracted by frequency domain decomposition, a multilayer perceptron is used for preliminary prediction, and the amplitude is corrected by the frequency amplitude shaping module. The prediction results are optimized by combining frequency domain and time domain loss functions.
Explicitly decoupling the periodic component and the residual component preserves amplitude information, avoids gradient imbalance, improves prediction accuracy, and solves the problem of frequency domain supervision signal offset. The backbone prediction model focuses on complex residual modeling, thereby improving the overall prediction performance.
Figure CN122241121A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of time series prediction technology, and more specifically to a two-branch time series prediction method based on frequency amplitude shaping and period alignment.
Background Technology
[0002] Currently, time series forecasting has significant applications in fields such as power systems, traffic management, and weather forecasting. In recent years, deep learning methods have made remarkable progress in this field, encompassing models based on recurrent neural networks (RNNs), Transformer-type models, and MLP-type models.
[0003] However, in real time series, high-amplitude periodic components and irregular residual components are often intertwined, which presents two difficulties for the learning of prediction models:
[0004] First, the problem of amplitude scale imbalance. In joint training based on mean squared error (MSE), a significant difference in amplitude scale exists between the periodic component and the residual component. This causes the gradient of the loss function to be dominated by the high-amplitude periodic component, suppressing the model's ability to learn fine-grained residual components. Existing normalization methods, such as RevIN, SAN, and DishTS, can alleviate the scale imbalance, but these operations simultaneously erase the absolute amplitude information in the periodic component. Since amplitude is crucial for period prediction, this inevitably sacrifices prediction accuracy.
[0005] Second, the problem of interference in joint learning. When a model is forced to learn structured periodic patterns and irregular residual changes simultaneously, the interference between the two reduces the model's ability to capture fine-grained residual patterns, thus affecting the overall prediction accuracy.
[0006] Existing time series decomposition methods, such as Autoformer and FEDformer, embed the decomposition module within the network, failing to achieve explicit decoupling between periodic and residual components. Frequency domain analysis methods, such as FiLM and FreTS, perform frequency domain enhancement within the network, similarly failing to fundamentally address the aforementioned challenges. Furthermore, existing frequency domain loss methods directly calculate the loss based on the frequency index. Since the frequency indices corresponding to the same physical period p in the input sequence (length T) and the predicted sequence (length H) are T / p and H / p respectively, they are inconsistent when T ≠ H, leading to a systematic shift in the frequency domain supervision signal. This results in the frequency domain constraints during training failing to correspond to the actual physical period.
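The index mismatch described above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the helper name `freq_index` is hypothetical and not part of the described method.

```python
def freq_index(window_length: int, period: float) -> float:
    """Spectral position (in RFFT bins) of a physical period within a window."""
    return window_length / period


# A daily period (p = 24 steps at hourly sampling) falls on a different bin
# for every window length, and the position may even be fractional.
for n in (96, 336, 720, 100):
    print(n, freq_index(n, 24))
```

A loss that compares bin 4 of the input spectrum with bin 4 of a 720-step prediction would therefore compare a 24-step period against a 180-step period.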
[0007] Therefore, how to provide a prediction framework that explicitly decouples the periodic and residual components while preserving amplitude information and providing accurate frequency domain supervision is a problem that needs to be solved by those skilled in the art.
Summary of the Invention
[0008] In view of the above problems, the present invention is proposed to provide a two-branch time series prediction method based on frequency amplitude shaping and period alignment that overcomes or at least partially solves the above problems.
[0009] To achieve the above objectives, the present invention adopts the following technical solution: a two-branch time series prediction method based on frequency amplitude shaping and period alignment, specifically including the following steps:
S1. Perform frequency domain decomposition on the input time series, and extract the Top-K periodic components and the residual component from the decomposed time series.
S2. Without normalization, input the periodic component to a multilayer perceptron-based predictor for preliminary prediction, and linearly transform the preliminary prediction with a frequency amplitude shaping module to obtain the prediction result of the periodic component.
S3. Input the residual component into the preset backbone prediction model to obtain the prediction result of the residual component.
S4. Construct a frequency domain loss function and a time domain loss function, and optimize the prediction results of the periodic component and the residual component with the hybrid loss function composed of the two.
S5. Obtain the final prediction result from the optimized prediction results of the periodic component and the residual component.
[0010] Furthermore, in step S1, the frequency domain decomposition proceeds as follows: a fast Fourier transform is applied to the input time series along the time dimension to obtain its frequency domain representation; after excluding the DC component from the frequency domain representation, the Top-K frequency components with the largest amplitudes are selected for each channel; the periodic component is reconstructed from the selected Top-K frequency components by an inverse real fast Fourier transform; and the residual component is obtained by subtracting the periodic component from the time series.
[0011] Further, in step S2, when the predictor based on the multilayer perceptron makes a preliminary prediction, it first extracts features from the periodic component through a linear layer and an activation function, then concatenates the extracted features with the original input time series along the channel dimension to explicitly preserve phase information, and finally maps the concatenated features to the prediction length through the multilayer perceptron to obtain the preliminary prediction result.
[0012] Further, in step S2, the frequency amplitude shaping module operates as follows: the preliminary prediction result is transformed to the frequency domain using a real fast Fourier transform to obtain the frequency domain prediction representation; a preset learnable weight matrix is multiplied element-wise with each frequency component and each channel of that representation to achieve the linear transformation; and the result of the element-wise multiplication is transformed back to the time domain using an inverse real fast Fourier transform to obtain the prediction result of the periodic component.
[0013] Further, in step S4, the frequency domain loss function is constructed based on a period alignment mechanism, which includes: for any physical period p identified from the frequency domain decomposition of the input time series according to claim 1, determining the continuous spectral position $u_p = H / p$ corresponding to the physical period p in the frequency domain of the predicted sequence, where H is the length of the predicted sequence; and, when $u_p$ is not an integer, estimating the amplitude at the continuous spectral position by linear interpolation between the spectral amplitude values at the adjacent integer indices on either side of it.
[0014] Furthermore, the frequency domain loss function is defined as the average truncated relative amplitude error over the set of Top-K physical periods p. The formula for calculating the frequency domain loss function is:
$$L_{freq} = \frac{1}{K} \sum_{p \in P_K} \min\left( \frac{\left| A(\hat{Y}^{(periodic)}, p) - A(Y, p) \right|}{A(Y, p) + \varepsilon},\ \tau \right)$$
[0015] Where ε is the numerical stability term, τ is the upper truncation bound, K is the number of selected frequency components, A is the function that calculates the amplitude of a given sequence at the physical period p, p is the physical period, Y is the actual sequence, $\hat{Y}^{(periodic)}$ is the prediction result of the periodic component, and $P_K$ is the set consisting of the Top-K physical periods extracted from the input time series.
[0016] Further, in step S4, the time-domain loss function is defined as the sum of the mean squared errors calculated for the periodic component and the residual component respectively: $L_{time} = \mathrm{MSE}(\hat{Y}^{(periodic)}, Y^{(periodic)}) + \mathrm{MSE}(\hat{Y}^{(res)}, Y^{(res)})$, where $\hat{Y}^{(periodic)}$ is the prediction result of the periodic component, $\hat{Y}^{(res)}$ is the prediction result of the residual component, $Y^{(periodic)}$ is the true periodic component in the time domain, and $Y^{(res)}$ is the true residual component in the time domain.
[0017] Further, in step S4, the hybrid loss function is the weighted sum of the time-domain loss function and the frequency-domain loss function according to a preset weight, and its expression is: $L = L_{time} + \lambda \cdot L_{freq}$, where λ is the balance coefficient, $L_{time}$ is the time-domain loss function, and $L_{freq}$ is the frequency domain loss function.
[0018] Furthermore, in step S3, before the residual components are input into the backbone prediction model for prediction, they are first subjected to reversible instance normalization. After the backbone prediction model outputs the results, they are then restored to the original scale through inverse normalization to obtain the residual component prediction results.
[0019] Further, in step S1, the value of K is determined based on the spectral characteristics of the dataset and is either 1 or 2. Specifically, it is assessed through the energy concentration index $E_c$, the proportion of the energy of the first K dominant frequency components in the total spectral energy; K is chosen as the value for which $E_c(K)$ is close to 1.
[0020] As can be seen from the above technical solution, compared with the prior art, the present invention discloses a dual-branch time series prediction method based on frequency amplitude shaping and period alignment, which has the following beneficial effects:
1. This invention solves the gradient imbalance caused by amplitude scale differences through explicit frequency domain decoupling, enabling the backbone prediction model to focus on modeling complex residuals.
[0021] 2. The periodic component branch directly processes the original-scale periodic components under non-normalization conditions, preserving the absolute amplitude information that is crucial for period prediction.
[0022] 3. The frequency amplitude shaping module applies independent learnable weights to each frequency component in the frequency domain, enabling fine control over the predicted amplitude. The parameter magnitude is O(H×C), which is negligible compared to the backbone prediction model.
[0023] 4. The period alignment mechanism maps the physical period of the input sequence to the frequency index of the prediction window through linear interpolation, avoiding the spectral misalignment caused by the difference in sequence length.
Attached Figure Description
[0024] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0025] Figure 1 is a flowchart of the dual-branch time series prediction method provided in the embodiments of the present invention;
Figure 2 is a schematic diagram of the overall framework of FAPA provided in an embodiment of the present invention;
Figure 3 is a spectral amplitude diagram of the ECL dataset provided in this embodiment of the invention;
Figure 4 is a comparison chart of frequency domain decomposition results on the Traffic dataset provided in this embodiment of the invention;
Figure 5 is a comparison chart of the main experimental results provided in the embodiments of the present invention;
Figure 6 is a comparison chart of ablation experimental results on the ETTm2 and ECL datasets provided in this embodiment of the invention.
Detailed Implementation
[0026] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0027] This invention discloses a two-branch time series prediction method based on frequency amplitude shaping and period alignment which, as shown in Figure 1, includes the following steps:
S1. Perform frequency domain decomposition on the input time series, and extract the Top-K periodic components and the residual component from the decomposed time series.
S2. Without normalization, input the periodic component to a multilayer perceptron-based predictor for preliminary prediction, and linearly transform the preliminary prediction with a frequency amplitude shaping module to obtain the prediction result of the periodic component.
S3. Input the residual component into the preset backbone prediction model to obtain the prediction result of the residual component.
S4. Construct a frequency domain loss function and a time domain loss function, and optimize the prediction results of the periodic component and the residual component with the hybrid loss function composed of the two.
S5. Obtain the final prediction result from the optimized prediction results of the periodic component and the residual component.
[0028] This invention addresses gradient imbalance caused by amplitude scale differences through explicit frequency domain decoupling, enabling the backbone prediction model to focus on modeling complex residuals.
[0029] Each of the above steps is described in detail below. In step S1, the input time series is decomposed in the frequency domain to extract the periodic component and the residual component. The dual-branch temporal prediction framework FAPA of the present invention is shown in Figure 2, which depicts the three modules involved in the entire process and the input and output flow of the framework. Among them, (a) is the decomposition module, which extracts the periodic component and the residual component through Top-K frequency selection during the frequency domain decomposition; (b) is the lightweight MLP predictor, which concatenates the original input with the frequency features to retain the phase information; and (c) is the frequency amplitude shaping module, which corrects the predicted amplitude in the frequency domain.
[0030] First, given a historical multivariate time series $X \in \mathbb{R}^{T \times C}$, where T is the input length and C is the number of variables, the prediction target is the sequence $Y \in \mathbb{R}^{H \times C}$ for the next H steps. A real-valued fast Fourier transform (RFFT) is applied along the time dimension to obtain the frequency domain representation of X. To ensure that the Top-K frequency selection focuses on the dynamic periodic pattern rather than the static mean, the DC component (zero-frequency component) is excluded. The sequence mean is therefore retained in the residual component and handled by the normalization layer of the subsequent backbone model. For each channel, the Top-K frequency components with the largest amplitudes are selected (K is the number of selected frequency components, usually 1 or 2), and their frequency index set is denoted as K', representing the positions of these K largest-amplitude frequency components in the spectrum.
[0031] In this embodiment, the value of K can be adjusted according to the spectral structure of the dataset and is either 1 or 2. It is determined from the spectral characteristics of the dataset, specifically through the energy concentration index $E_c$, defined as the proportion of the energy of the first K dominant frequency components in the total spectral energy. Its formula (1) is as follows:
$$E_c(K) = \frac{\sum_{i=1}^{K} |X(f_i)|^2}{\sum_{f \neq 0} |X(f)|^2} \tag{1}$$
Where K is the number of dominant frequency components selected; $f_i$ is the i-th largest frequency component after sorting by amplitude; $X(f_i)$ is the complex spectrum value at frequency $f_i$; and $|X(f_i)|^2$ is the energy of that frequency component, i.e., the square of its amplitude. The numerator is the sum of the energy of the K frequency components with the largest amplitudes, with the summation running from i = 1 to K; the denominator is the total energy of the input sequence at all non-zero frequencies, with the summation running over all frequency components except the DC component. This index quantifies the share of the total periodic energy captured by the first K dominant frequency components; when $E_c(K)$ is close to 1, K components can effectively characterize the periodicity of the original signal.
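A minimal NumPy sketch of the energy concentration index for a single channel follows. The function name `energy_concentration` and the toy signal are assumptions made for illustration; how the index is aggregated over channels or samples of a dataset is not specified in the text and is left out.

```python
import numpy as np


def energy_concentration(x: np.ndarray, k: int) -> float:
    """E_c(K) for a 1-D series: top-K non-DC spectral energy / total non-DC energy."""
    spectrum = np.fft.rfft(x)
    energy = np.abs(spectrum[1:]) ** 2      # drop the DC bin (index 0)
    total = energy.sum()
    if total == 0.0:
        return 0.0                          # constant series: no periodic energy
    top_k = np.sort(energy)[::-1][:k]       # the K largest energies
    return float(top_k.sum() / total)


# Toy series: a strong 24-step cycle, a weaker 12-step cycle, and noise.
rng = np.random.default_rng(0)
t = np.arange(96)
x = (5.0 * np.sin(2 * np.pi * t / 24)
     + 1.0 * np.sin(2 * np.pi * t / 12)
     + 0.1 * rng.standard_normal(96))
for k in (1, 2):
    print(k, round(energy_concentration(x, k), 3))
```

Comparing the printed values for K=1 and K=2 mirrors the selection rule in the text: a second component is only worth adding if it raises the index noticeably.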
[0032] In this embodiment, as shown in Figure 3, a spectral analysis is performed using the ECL dataset as an example. In the spectral amplitude plot, the energy of the periodic component is concentrated at several distinct frequency peaks whose amplitudes are significantly higher than those of the other frequency components. The figure clearly shows the amplitude difference between the periodic components and the residual. When K=2, the energy concentration index $E_c$ reaches 0.82, indicating that the two dominant frequency components capture more than 80% of the periodic energy. In this case, choosing K=2 effectively separates the periodic component from the residual component without introducing too much noise. For datasets with weaker periodicity, such as Traffic, $E_c$ is 0.52 when K=1 and only rises to 0.65 when K=2, which is a limited increase. Therefore, K=1 is chosen to avoid introducing unstable high-frequency components.
[0033] For a frequency with index i, the corresponding physical period is p = T / i. For example, in data with a sampling interval of 1 hour and an input length T = 96, frequency index i = 4 corresponds to a 24-hour period (daily period). A frequency domain mask is constructed that retains only the Top-K frequency components, and the periodic component $X^{(periodic)}$ is reconstructed using an inverse real fast Fourier transform (iRFFT). Subtracting it from the original signal yields the residual component $X^{(res)}$, that is: $X = X^{(periodic)} + X^{(res)}$.
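The decomposition in step S1 can be sketched in NumPy roughly as follows. The function name `decompose_top_k` and the toy signal are illustrative assumptions; batching and framework-specific details are omitted.

```python
import numpy as np


def decompose_top_k(x: np.ndarray, k: int = 2):
    """Split x of shape (T, C) into periodic and residual parts per channel.

    Returns (periodic, residual, top_idx), where top_idx has shape (k, C) and
    holds the selected RFFT bin indices for each channel.
    """
    T = x.shape[0]
    spec = np.fft.rfft(x, axis=0)                 # (T//2 + 1, C), complex
    amp = np.abs(spec)
    amp[0, :] = 0.0                               # exclude the DC bin from selection
    top_idx = np.argsort(amp, axis=0)[-k:, :]     # k largest-amplitude bins per channel
    mask = np.zeros_like(amp, dtype=bool)
    np.put_along_axis(mask, top_idx, True, axis=0)
    periodic = np.fft.irfft(np.where(mask, spec, 0.0), n=T, axis=0)
    residual = x - periodic                       # the mean stays in the residual
    return periodic, residual, top_idx


# Toy single-channel input: offset + daily cycle + a weak off-grid cycle.
t = np.arange(96)[:, None]
x = 10.0 + 5.0 * np.sin(2 * np.pi * t / 24) + 0.5 * np.cos(2 * np.pi * t / 7)
periodic, residual, idx = decompose_top_k(x, k=1)
periods = 96 / idx                                # physical period p = T / i
```

Because the DC bin is never selected, the reconstructed periodic part has zero mean and the offset of the series ends up in the residual, as the text describes.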
[0034] Figure 4 illustrates the frequency domain decomposition results on the Traffic dataset. The black line represents the original sequence, the blue line represents the extracted periodic component, and the red line represents the residual. After frequency domain decomposition, the fluctuation amplitude of the residual component is significantly smaller than that of the original sequence, which makes the residual component easier for the backbone prediction model to fit.
[0035] In step S2, the periodic component is predicted. The periodic component obtained by the frequency domain decomposition in step S1 is used to predict the future periodic component through the two stages of the periodic component branch, namely the branch in Figure 2 that flows from the frequency domain decomposition into the MLP predictor. The set of physical periods corresponding to the Top-K frequency components obtained from the frequency domain decomposition is denoted as $P_K$ and is used to construct the subsequent frequency domain loss function. The set of frequency indices corresponding to these components is denoted as K' and is used to construct the frequency domain mask for reconstructing the periodic component.
[0036] Predictor based on a multilayer perceptron (MLP): the reconstructed periodic component $X^{(periodic)}$ contains frequency and amplitude information, but lacks contextual information from the original sequence, such as the current phase position and local trend. The predictor therefore processes both the periodic signal and the original input: the periodic component first passes through a linear layer and an activation function for feature extraction. The extracted features are then concatenated with the original input X to explicitly preserve phase information and keep the prediction phase-aligned with the original sequence. Finally, the concatenated features are mapped to the prediction length by an MLP to obtain the preliminary prediction result, given by formula (2):
$$\hat{Y}_{init} = \mathrm{MLP}\big(\mathrm{Concat}(\sigma(\mathrm{Linear}(X^{(periodic)})),\ X)\big) \tag{2}$$
Where Linear(·) is the linear projection layer, σ is the activation function, MLP(·) is the multilayer perceptron projection, ⊙ represents element-wise multiplication, and Concat(·) is the concatenation operation along the channel dimension. With the original input sequence $X \in \mathbb{R}^{T \times C}$, the periodic component keeps its dimension $\mathbb{R}^{T \times C}$ after the linear mapping. The concatenation is performed along the feature dimension, so the feature dimension becomes 2C. The MLP takes this as input and restores the output dimension to C, thus preserving the phase information without changing the input specification of subsequent modules.
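To make the tensor shapes concrete, here is a minimal NumPy forward pass of such a predictor. Several details are assumptions not stated in the text: ReLU as the activation, a channel-mixing layer followed by a time projection as the MLP, and the parameter names in `params`. Weights are random, so only the shapes are meaningful.

```python
import numpy as np


def mlp_predictor(x_periodic, x, params):
    """Forward pass: (T, C) periodic part and (T, C) raw input -> (H, C)."""
    # 1) Feature extraction on the periodic component: linear layer + activation.
    feat = np.maximum(x_periodic @ params["w_feat"] + params["b_feat"], 0.0)  # (T, C)
    # 2) Concatenate with the raw input along the channel axis to keep phase context.
    z = np.concatenate([feat, x], axis=1)                                     # (T, 2C)
    # 3) Assumed MLP: mix channels back to C, then project time from T to H.
    h = np.maximum(z @ params["w_mix"] + params["b_mix"], 0.0)                # (T, C)
    return params["w_time"] @ h + params["b_time"]                            # (H, C)


T, H, C = 96, 192, 3
rng = np.random.default_rng(0)
params = {
    "w_feat": rng.normal(scale=0.1, size=(C, C)), "b_feat": np.zeros(C),
    "w_mix": rng.normal(scale=0.1, size=(2 * C, C)), "b_mix": np.zeros(C),
    "w_time": rng.normal(scale=0.1, size=(H, T)), "b_time": np.zeros((H, 1)),
}
y_init = mlp_predictor(rng.normal(size=(T, C)), rng.normal(size=(T, C)), params)
```

The output has shape (H, C), so the module that follows sees the same layout regardless of the 2C-wide intermediate features.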
[0037] To improve the amplitude accuracy of the periodic component prediction, a frequency amplitude shaping module is introduced, which applies an independent, learnable transformation to each frequency component in the frequency domain, as shown in module (c) of Figure 2. After the preliminary prediction output by the MLP is converted to the frequency domain, a learnable linear transformation is applied to each frequency and each channel, as given by formula (3):
$$\hat{Y}^{(periodic)} = \mathrm{iRFFT}\big(W \odot \mathrm{RFFT}(\hat{Y}_{init})\big) \tag{3}$$
Here, $W \in \mathbb{R}^{(H/2+1) \times C}$ is the learnable weight matrix, with H/2+1 rows (one per frequency component) and C columns (one per channel), and ⊙ represents element-wise multiplication. W is initialized as an all-ones matrix, and the weight corresponding to the DC component is always kept at 1 to avoid affecting the sequence mean. The module adopts a channel-independent design: since the periodic amplitudes of different variables may vary significantly, channel-level parameters allow the module to learn an amplitude correction coefficient for each variable independently. The module has only O(H×C) parameters, which is negligible compared to the backbone prediction model.
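The shaping step can be sketched in NumPy as below. The weight matrix is assumed to be real-valued and is passed in as a plain array rather than a trainable parameter; the function name `amplitude_shaping` is hypothetical.

```python
import numpy as np


def amplitude_shaping(y_init: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Scale each RFFT bin of y_init (H, C) by w (H//2 + 1, C), then return to time."""
    H = y_init.shape[0]
    spec = np.fft.rfft(y_init, axis=0)        # (H//2 + 1, C)
    w = w.copy()
    w[0, :] = 1.0                             # keep the DC weight fixed at 1
    return np.fft.irfft(w * spec, n=H, axis=0)


H, C = 96, 2
t = np.arange(H)[:, None]
y_init = np.sin(2 * np.pi * t / 24) * np.ones((1, C))
w = np.ones((H // 2 + 1, C))                  # all-ones initialisation: identity map
w[4, 1] = 1.5                                 # boost the 24-step bin of channel 1 only
y_shaped = amplitude_shaping(y_init, w)
```

With the all-ones initialisation the module is an identity map, so training starts from the MLP's own prediction; changing a single entry of `w` rescales one period in one channel and leaves everything else untouched.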
[0038] In step S3, the residual component is predicted. The residual component obtained by the frequency domain decomposition in step S1 is used to predict the future residual component via the residual component branch, namely the branch in Figure 2 that flows from the frequency domain decomposition, after normalization, into the backbone model. First, the residual component $X^{(res)}$ obtained by frequency domain decomposition is processed. To ensure a fair comparison with baseline methods, the default configuration of the backbone prediction model is maintained.
[0039] Taking the PatchTST backbone prediction model as an example, the residual component is first standardized by RevIN (Reversible Instance Normalization), fed into the backbone prediction model to obtain predictions, and then restored to the original scale by inverse normalization to obtain $\hat{Y}^{(res)}$. This framework does not modify the internal structure of the backbone prediction model; it only feeds the residual as input so that the backbone can focus on modeling the complex residual component, thereby reducing the learning difficulty.
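The normalize, predict, denormalize wrapper around the backbone can be sketched as follows. This is a simplified stand-in: the learnable affine parameters of RevIN are omitted, and `repeat_last` is a hypothetical placeholder for a real backbone such as PatchTST.

```python
import numpy as np


def predict_residual(x_res: np.ndarray, backbone, eps: float = 1e-5) -> np.ndarray:
    """Normalise (T, C) residuals per channel, run the backbone, undo the scaling."""
    mean = x_res.mean(axis=0, keepdims=True)
    std = np.sqrt(x_res.var(axis=0, keepdims=True) + eps)
    y_norm = backbone((x_res - mean) / std)   # backbone maps (T, C) -> (H, C)
    return y_norm * std + mean                # restore the original scale


def repeat_last(z: np.ndarray, horizon: int = 4) -> np.ndarray:
    """Stand-in backbone (hypothetical): repeat the last observed step."""
    return np.repeat(z[-1:, :], horizon, axis=0)


x_res = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]])
y_res = predict_residual(x_res, repeat_last)
```

The statistics are computed from the input window itself, so the backbone only ever sees inputs on a comparable scale, while the output is returned on the original scale of each channel.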
[0040] In step S4, the frequency domain loss function is constructed. The fundamental challenge in calculating the frequency domain loss is that the same physical period p corresponds to different frequency indices in sequences of different lengths. For example, the daily period (p = 24 h) corresponds to index k = 4 for the input length T = 96, but to index k = 30 for the prediction length H = 720. To address this, this invention uses the physical period rather than the raw frequency index as the basis for alignment. For a physical period p identified from the input, where p is calculated from the frequency index i as p = T / i, the corresponding spectral position in the prediction sequence is H / p. Since this position is usually non-integer, linear interpolation is used to estimate the amplitude at this continuous position, as shown in formula (4):
$$A(\hat{Y}, p) = (1 - w)\, S[\lfloor u_p \rfloor] + w\, S[\lceil u_p \rceil], \qquad u_p = \frac{H}{p}, \qquad w = u_p - \lfloor u_p \rfloor \tag{4}$$
Where S is the amplitude spectrum of the sequence, obtained by taking the modulus after applying the RFFT to the predicted sequence; $u_p$ is the spectral position corresponding to the physical period p in the predicted sequence; w is the interpolation weight; and $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$ are the rounding-down and rounding-up operations, respectively. This mechanism ensures that the frequency domain constraints and the time domain objectives are always aligned with the same set of physical periods, avoiding semantic misalignment between the two.
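The interpolation step can be illustrated with a small standard-library sketch. The function name `aligned_amplitude` and the toy spectrum are assumptions; the spectrum is taken as a precomputed list of RFFT magnitudes.

```python
import math


def aligned_amplitude(spectrum, horizon, period):
    """Linearly interpolate an amplitude spectrum at u_p = horizon / period.

    `spectrum` holds RFFT magnitudes of a length-`horizon` sequence
    (bins 0 .. horizon // 2).
    """
    u = horizon / period
    lo, hi = math.floor(u), math.ceil(u)
    if hi >= len(spectrum):
        raise ValueError("period is too short for this horizon")
    w = u - lo                                  # interpolation weight in [0, 1)
    return (1.0 - w) * spectrum[lo] + w * spectrum[hi]


# Toy magnitudes for a horizon of 100 steps (51 RFFT bins).
spectrum = [0.0] * 51
spectrum[4], spectrum[5] = 12.0, 6.0
amp_24 = aligned_amplitude(spectrum, horizon=100, period=24)   # u_p = 100 / 24
amp_25 = aligned_amplitude(spectrum, horizon=100, period=25)   # u_p = 4 exactly
```

When the period divides the horizon, the weight is zero and the function simply reads the matching bin; otherwise it blends the two neighbouring bins in proportion to the fractional part of the position.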
[0041] Frequency domain constraint loss: the frequency domain loss aims to ensure that the predicted periodic component matches the dominant periodic amplitudes of the ground truth. It is defined as the average truncated relative amplitude error over the Top-K physical period set $P_K$ extracted from the input, as given by formula (5):
$$L_{freq} = \frac{1}{K} \sum_{p \in P_K} \min\left( \frac{\left| A(\hat{Y}^{(periodic)}, p) - A(Y, p) \right|}{A(Y, p) + \varepsilon},\ \tau \right) \tag{5}$$
Where ε is a numerical stability term that prevents the denominator from being zero; τ is the upper bound for truncation, which suppresses abnormal gradients and improves training stability; K is the number of selected frequency components; p is the physical period; A is the function that calculates the amplitude of a given sequence at the physical period p; and Y is the real sequence. In this embodiment, τ is set to 2.0, but in practical applications it can be adjusted within the range of 1.5 to 3.0 according to the amplitude characteristics of the dataset.
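A small sketch of a truncated relative amplitude error is given below. It assumes that the relative error is measured against the amplitude of the true sequence and that the amplitudes at each period have already been computed; the function name and the toy values are illustrative only.

```python
def frequency_loss(pred_amp, true_amp, tau=2.0, eps=1e-8):
    """Mean truncated relative amplitude error over the selected periods.

    `pred_amp` and `true_amp` map each physical period p to the amplitude of
    the prediction and of the ground truth at that period.
    """
    total = 0.0
    for p, a_true in true_amp.items():
        rel_err = abs(pred_amp[p] - a_true) / (a_true + eps)
        total += min(rel_err, tau)              # cap outliers at tau
    return total / len(true_amp)


true_amp = {24: 10.0, 48: 4.0}
pred_amp = {24: 9.0, 48: 20.0}
loss = frequency_loss(pred_amp, true_amp)
```

In this toy case the 48-step period has a relative error of 4, which the truncation caps at τ = 2.0, so a single badly predicted period cannot dominate the loss.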
[0042] Time-domain constraint loss: time-domain supervision calculates the MSE for the periodic component and the residual component separately: $L_{time} = \mathrm{MSE}(\hat{Y}^{(periodic)}, Y^{(periodic)}) + \mathrm{MSE}(\hat{Y}^{(res)}, Y^{(res)})$. Crucially, the true time-domain targets $Y^{(periodic)}$ and $Y^{(res)}$ are constructed with the same period alignment mechanism, which forces the decomposition of the ground truth to use exactly the same physical period set $P_K$ as that identified in the input X, ensuring that the supervision signal of the periodic component branch is physically consistent with the decomposition logic of the model.
[0043] The total hybrid loss is defined as $L = L_{time} + \lambda \cdot L_{freq}$. Systematic experiments on five datasets (ECL, Traffic, Weather, ETTm1, and ETTm2) have verified that λ = 0.1 achieves optimal or near-optimal prediction performance on all of them. In this embodiment, λ is fixed at 0.1.
[0044] In step S5, the final prediction result is output. The prediction results of the two branches are added together to obtain the final prediction output: $\hat{Y} = \hat{Y}^{(periodic)} + \hat{Y}^{(res)}$.
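As a worked illustration of how the two loss terms combine, the following standard-library sketch evaluates the hybrid loss on toy values. The function names and inputs are hypothetical, and the frequency-domain term is passed in as a precomputed number.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(target)


def hybrid_loss(y_per_hat, y_per, y_res_hat, y_res, l_freq, lam=0.1):
    """L = L_time + lambda * L_freq, with L_time summed over both branches."""
    l_time = mse(y_per_hat, y_per) + mse(y_res_hat, y_res)
    return l_time + lam * l_freq


# Periodic branch MSE = 2.0, residual branch MSE = 1.0, frequency term = 1.05.
total = hybrid_loss([1.0, 2.0], [1.0, 4.0], [0.0], [1.0], l_freq=1.05)
```

With λ = 0.1 the frequency term contributes 0.105 here, acting as a light regulariser on top of the time-domain error of 3.0.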
[0045] Example: Taking a power load dispatching scenario as an example; The data source for this embodiment is the ECL public dataset, which contains hourly electricity consumption data from 321 users, spanning from 2012 to 2014. The original data was divided into training, validation, and test sets in a 7:1:2 ratio. The input sequence length T=96 (corresponding to 4 days of historical data), and the prediction length H=96 (corresponding to the next 4 days). Each variable was standardized using RevIN based on the mean and standard deviation of the training set, but this standardization was only used as input to the backbone model of the residual branch; the periodic component branch directly used the original scale data.
[0046] Model input and decomposition: the historical 96-hour load sequence $X \in \mathbb{R}^{96 \times 321}$ is taken as input. An RFFT is applied, the DC component is excluded, and the amplitude spectrum of each channel is calculated, with K=2 (corresponding to diurnal and bi-diurnal components with periods of approximately 24 hours and 48 hours). The periodic component $X^{(periodic)}$ is reconstructed using iRFFT, and the residual is calculated as $X^{(res)} = X - X^{(periodic)}$.
[0047] Branch prediction and fusion: the periodic component branch takes the original-scale $X^{(periodic)}$ and the original input X as input; the MLP predictor produces a preliminary periodic prediction, and the frequency amplitude shaping module then corrects the amplitude in the frequency domain and outputs $\hat{Y}^{(periodic)}$. The residual component branch applies RevIN normalization to $X^{(res)}$, feeds it into the PatchTST backbone prediction model, and outputs $\hat{Y}^{(res)}$ after inverse normalization. The final prediction output is $\hat{Y} = \hat{Y}^{(periodic)} + \hat{Y}^{(res)}$, with shape $\mathbb{R}^{96 \times 321}$.
[0048] Training and loss: the model is trained with the hybrid loss $L = L_{time} + \lambda \cdot L_{freq}$ (λ=0.1) using the Adam optimizer with a learning rate of 1×10⁻³ for up to 100 epochs, with an early stopping patience of 10 epochs.
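The epoch budget and early stopping rule can be sketched independently of any deep learning framework. The class and function names below are hypothetical, and the validation losses are supplied as a plain list instead of being produced by a real training loop.

```python
class EarlyStopping:
    """Stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run(val_losses, max_epochs=100, patience=10):
    """Return the number of epochs actually run for a given loss trace."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), max_epochs)


# Loss improves for two epochs, then plateaus.
epochs_run = run([1.0, 0.9] + [0.95] * 50)
```

In a real run the value passed to `step` would be the validation loss computed after each epoch, and the checkpoint with the best validation loss would typically be kept.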
[0049] Scheduling guidance based on forecast results: This involves forecasting the load on each node for the next 96 hours. After aggregation, the data is output to the upper-level application system and can be used in the following typical scenarios: First, the power grid dispatching system can plan the start-up and shutdown schedules and output allocation of generator units in advance based on the predicted load curves of each node, reducing the real-time dispatching pressure caused by sudden load increases; Second, the renewable energy consumption management system can optimize the grid connection plans of wind power and photovoltaic power based on predicted off-peak periods. On the test set of this embodiment, compared with using PatchTST alone, FAPA+PatchTST reduced the MSE from 0.349 to 0.303, a reduction of 13.08%. The improvement in prediction accuracy directly reduces the safety margin introduced by prediction deviation in dispatching decisions, which helps to reduce the redundant configuration of reserve capacity and verifies the practical value of this framework in power load forecasting scenarios.
[0050] Effect verification: As shown in Figure 5, this embodiment combines the FAPA framework with four backbone prediction models: PatchTST, DLinear, iTransformer, and TSMixer. Systematic validation was performed on five benchmark datasets: ECL (electricity load dataset), Traffic (traffic flow dataset), Weather (weather dataset), and ETTm1 and ETTm2 (electricity transformer temperature datasets), with prediction lengths H ∈ {96, 192, 336, 720}. The MSE (mean squared error) and MAE (mean absolute error) metrics are reported. In the table, PatchTST is a channel-independent Transformer model based on a patch mechanism; DLinear is a linear model that decomposes the sequence into trend and seasonal components; iTransformer is an improved Transformer that applies attention along the channel dimension; and TSMixer is an MLP-Mixer architecture model that alternates between temporal mixing and feature mixing. The last row, "Average," gives the average performance across all five datasets and all prediction lengths; bold numbers indicate the best results.
[0051] For the channel-independent (CI) models (PatchTST, DLinear), FAPA achieved its most significant performance improvements on the ECL and Traffic datasets (MSE reduction of approximately 4.5%–14.3%), and also showed stable improvements on the ETTm1 and Weather datasets. Specifically, when combined with PatchTST on the ECL dataset, the average MSE across all prediction lengths decreased from 0.349 to 0.303, a reduction of 13.08%. From short-term predictions (H=96) to long-term predictions (H=720), FAPA consistently improved the performance of the CI models. The additional parameters introduced by FAPA did not exceed 3.37% of the parameter count of the backbone prediction model.
[0052] As shown in Figure 6, this embodiment uses PatchTST as the backbone prediction model and conducts ablation experiments on two benchmark datasets, ETTm2 and ECL, with prediction lengths H ∈ {96, 192, 336, 720}. The contributions of the full FAPA framework and of each module to prediction performance are evaluated. The figure reports the MSE (mean squared error) and MAE (mean absolute error) metrics. The ablation settings are: the full FAPA framework (Full), a variant with the period alignment mechanism removed (-PeriodAlign), a variant with the frequency amplitude shaping module removed (-AmpShape), and a variant with both removed (-Both).
[0053] The experimental results show that removing either module leads to a decrease in prediction performance, with the most significant degradation occurring when both modules are removed simultaneously. This verifies the complementary roles of the period alignment mechanism and the frequency amplitude shaping module in the framework.
[0054] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the other embodiments. For similar or identical parts, the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
[0055] The above description of the disclosed embodiments enables those skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims
1. A two-branch time series prediction method based on frequency amplitude shaping and period alignment, characterized in that it includes the following steps:
S1. Perform frequency domain decomposition on the input time series, and extract the Top-K periodic components and the residual components based on the frequency domain decomposed time series;
S2. Without normalization, input the periodic component to a multilayer perceptron-based predictor for preliminary prediction, and linearly transform the result of the preliminary prediction with a frequency amplitude shaping module to obtain the prediction result of the periodic component;
S3. Input the residual components into the preset backbone prediction model for prediction to obtain the prediction result of the residual component;
S4. Construct a frequency domain loss function and a time domain loss function, and optimize the prediction result of the periodic component and the prediction result of the residual component based on the hybrid loss function composed of the frequency domain loss function and the time domain loss function;
S5. Based on the optimized prediction result of the periodic component and the optimized prediction result of the residual component, obtain the final prediction result.
2. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S1, the specific process of frequency domain decomposition includes: applying a fast Fourier transform along the time dimension to the input time series to obtain its frequency domain representation; after excluding the DC component from the frequency domain representation, selecting for each channel the Top-K frequency components with the largest amplitudes; reconstructing the periodic component from the selected Top-K frequency components by an inverse real fast Fourier transform; and obtaining the residual component by subtracting the periodic component from the input time series.
3. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S2, when the multilayer perceptron-based predictor makes the preliminary prediction, it first extracts features from the periodic component through a linear layer and an activation function, then concatenates the extracted features with the original input time series along the channel dimension to explicitly preserve phase information, and finally maps the concatenated features to the prediction length through the multilayer perceptron to obtain the preliminary prediction result.
4. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S2, the processing of the frequency amplitude shaping module includes: transforming the preliminary prediction result to the frequency domain using a real fast Fourier transform to obtain the frequency domain prediction representation; using a preset learnable weight matrix, performing element-wise multiplication on each frequency component and each channel of the frequency domain prediction representation to achieve the linear transformation; and transforming the result of the element-wise multiplication back to the time domain using an inverse real fast Fourier transform to obtain the prediction result of the periodic component.
5. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S4, the frequency domain loss function is constructed based on a period alignment mechanism, which includes: for any physical period p identified from the frequency domain decomposition of the input time series according to claim 1, determining the continuous spectral position $u_p = H / p$ corresponding to the physical period p in the frequency domain of a predicted sequence, where H is the length of the predicted sequence; and, when the continuous spectral position $u_p$ is not an integer, estimating the amplitude value at the continuous spectral position by linear interpolation from the spectral amplitude values at the adjacent integer indices on both sides of that position.
6. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 5, characterized in that the frequency domain loss function is defined as the average truncated relative amplitude error over the set consisting of the Top-K physical periods p. The formula for calculating the frequency domain loss function is: $L_{freq} = \frac{1}{K} \sum_{p \in P_K} \min\left( \frac{\left| A(\hat{Y}^{(periodic)}, p) - A(Y, p) \right|}{A(Y, p) + \varepsilon}, \tau \right)$, where $\varepsilon$ is the numerical stability term, $\tau$ is the upper truncation bound, K is the number of selected frequency components, $A(\cdot, p)$ is the function for calculating the amplitude of the given sequence at the physical period p, p is the physical period, Y is the actual sequence, $\hat{Y}^{(periodic)}$ is the prediction result of the periodic component, and $P_K$ is the set consisting of the Top-K physical periods extracted from the input time series.
7. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S4, the time-domain loss function is defined as the sum of the mean square errors calculated for the periodic component and the residual component respectively: $L_{time} = \mathrm{MSE}(\hat{Y}^{(periodic)}, Y^{(periodic)}) + \mathrm{MSE}(\hat{Y}^{(res)}, Y^{(res)})$, where $\hat{Y}^{(periodic)}$ is the prediction result of the periodic component, $\hat{Y}^{(res)}$ is the prediction result of the residual component, $Y^{(periodic)}$ is the true periodic component in the time domain, and $Y^{(res)}$ is the true residual component in the time domain.
8. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S4, the hybrid loss function is the weighted sum of the time-domain loss function and the frequency-domain loss function according to preset weights, and its expression is: $L = L_{time} + \lambda \cdot L_{freq}$, where $\lambda$ is the balance coefficient, $L_{time}$ is the time-domain loss function, and $L_{freq}$ is the frequency-domain loss function.
9. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S3, before the residual components are input into the backbone prediction model for prediction, they are first subjected to reversible instance normalization; after the backbone prediction model outputs its results, they are restored to the original scale by inverse normalization to obtain the prediction result of the residual component.
10. The two-branch time series prediction method based on frequency amplitude shaping and period alignment as described in claim 1, characterized in that, in step S1, the value of K in Top-K is determined based on the spectral characteristics of the dataset and is 1 or 2; specifically, it is assessed by the energy concentration index $E_c$, where $E_c(K)$ is the proportion of the energy of the first K dominant frequency components to the total spectral energy, and the value of K is determined when $E_c(K)$ is close to 1.
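By way of illustration only, and not as part of the claims, the energy concentration index of claim 10 might be computed as in the following NumPy sketch. The function names, the exclusion of the DC bin from the total energy, the averaging over channels, and the threshold of 0.9 used to decide that $E_c(K)$ is "close to 1" are all assumptions; the source does not specify them.

```python
import numpy as np

def energy_concentration(x: np.ndarray, k: int) -> np.ndarray:
    """Share of spectral energy held by the k strongest bins, per channel.

    x has shape (T, C). The DC bin is dropped before computing energies
    (assumption; the source only says "total spectral energy").
    """
    power = np.abs(np.fft.rfft(x, axis=0)) ** 2
    power = power[1:]                          # drop DC
    top = np.sort(power, axis=0)[-k:]          # k largest energies per channel
    return top.sum(axis=0) / power.sum(axis=0)

def choose_k(x: np.ndarray, candidates=(1, 2), threshold: float = 0.9) -> int:
    """Smallest candidate K whose channel-averaged E_c(K) reaches the threshold."""
    for k in candidates:
        if energy_concentration(x, k).mean() >= threshold:
            return k
    return candidates[-1]                      # fall back to the largest candidate

t = np.arange(96)[:, None]
one_cycle = np.sin(2 * np.pi * t / 24)                  # single dominant period
two_cycles = one_cycle + np.sin(2 * np.pi * t / 48)     # two equally strong periods

k_one = choose_k(one_cycle)
k_two = choose_k(two_cycles)
```

For the single-cycle signal almost all energy sits in one bin, so K=1 suffices; for the signal with two equally strong cycles, $E_c(1)$ is about 0.5 and K=2 is selected.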