A stock index prediction method based on 1D CNN-ResBiLSTM

By using the 1DCNN-ResBiLSTM model, which combines a one-dimensional convolutional network and a bidirectional LSTM network, the problem of insufficient local feature extraction in stock index prediction is solved. This enables accurate characterization of short-term stock price changes and capture of long-term dependencies, thereby improving prediction accuracy and stability.

CN122243640A · Pending · Publication Date: 2026-06-19 · FUDAN UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Application (China)
Current Assignee / Owner
FUDAN UNIVERSITY
Filing Date
2026-04-07
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, stock index prediction models have weak local feature extraction capabilities and cannot effectively uncover local time-series patterns in short-term stock price changes, resulting in low prediction accuracy.

Method used

A 1DCNN-ResBiLSTM model is adopted, which extracts local features through a one-dimensional convolutional neural network (1DCNN), learns long-term dependencies through a bidirectional long short-term memory network (BiLSTM), and makes predictions by combining a residual feature fusion mechanism with a fully connected layer.

Benefits of technology

It improves the accuracy and stability of stock index forecasting by capturing both the local fluctuation characteristics and the long-term dependencies in stock time series more precisely.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention relates to the field of time series forecasting technology and discloses a stock index forecasting method based on 1DCNN-ResBiLSTM, comprising the following steps: S1, data acquisition and sample construction; S2, data preprocessing; S3, local feature extraction; S4, long-term dependency learning; S5, residual feature fusion: extracting the feature vector of the last time step output by the 1DCNN module in S3, linearly mapping it to the same dimension as the output feature of S4 through a fully connected layer, and then adding it element-wise with the output feature of S4 to achieve residual feature fusion; S6, prediction result output; S7, model training and optimization. By adopting a technical solution that combines the 1DCNN module with time-dimensional sliding convolution and ReLU nonlinear activation, the method accurately captures the local fluctuation features and short-term trend information of stock time series, solving the problems of insufficient characterization of short-term stock price changes and inability to effectively mine local time series patterns.

Description

Technical Field

[0001] This invention relates to the field of time series forecasting technology, specifically to a stock index forecasting method based on 1DCNN-ResBiLSTM.

Background Technology

[0002] Stock index prediction is a crucial technological area in financial data mining and quantitative investment. Its core principle is to predict future stock price trends by analyzing time-series patterns in historical stock trading data, providing technical support for investor decision-making, risk management, and financial market analysis. Currently, stock index prediction technologies are widely applied in various financial scenarios. Related prediction models primarily achieve predictions by extracting features from stock time series data and modeling time-series dependencies. The core objective is to improve the accuracy and stability of predictions and adapt to the complex, non-stationary, and non-linear characteristics of the stock market.

[0003] Current stock index prediction technologies still have significant shortcomings, the most prominent being their weak ability to extract local features. Most existing prediction models rely solely on recurrent neural networks or traditional convolutional networks for local feature extraction. The former focuses more on time-series dependency modeling and is not accurate enough in capturing local patterns of short-term stock price fluctuations, while the latter lacks optimization design for the time dimension and cannot effectively uncover local temporal patterns in short-term stock price changes. This results in insufficient characterization of short-term price fluctuation characteristics and short-term trend information, thus affecting overall prediction accuracy. This problem has become a key bottleneck restricting the further development of stock index prediction technology.

Summary of the Invention

[0004] To address the shortcomings of existing technologies, this invention provides a stock index prediction method based on 1DCNN-ResBiLSTM, which solves the problem that existing technologies rely solely on recurrent neural networks or traditional convolutional networks to extract local features, resulting in insufficient characterization of short-term stock price changes and an inability to effectively mine local time-series patterns.

[0005] To achieve the above objectives, the present invention provides the following technical solution: a stock index prediction method based on 1DCNN-ResBiLSTM, comprising the following steps:

[0006] S1, Data Acquisition and Sample Construction: Acquire multi-dimensional market feature data of historical stock transactions, sort them by time, and construct supervised learning samples using the sliding window method. Use the multi-dimensional feature data in the window as the model input, and the closing price of the stock on the next trading day as the prediction target.

[0007] S2, Data Preprocessing: Min-Max normalization is applied to the model input features and prediction targets obtained in S1 to map the feature values to a specified numerical range.

[0008] S3, Local Feature Extraction: Input the normalized time series data from S2 into the 1DCNN module to extract the local variation features of the stock time series and output the feature representation of the specified dimension.

[0009] S4, Long-term Dependency Learning: Input the output features of S3 into a stacked BiLSTM module to learn the long-term dependencies of the stock time series, select the hidden state of the last time step of the series as the feature representation, and output the feature representation of the specified dimension.

[0010] S5, Residual Feature Fusion: Extract the feature vector of the last time step output by the 1DCNN module in S3, linearly map it to the same dimension as the output feature of S4 through a fully connected layer, and then add it element by element with the output feature of S4 to achieve residual feature fusion;

[0011] S6, Prediction Result Output: Input the fused features from S5 into the fully connected layer for regression prediction, and output the predicted closing price of the stock for the next trading day;

[0012] S7, Model Training and Optimization: Using mean squared error as the loss function, the model parameters are iteratively updated through the Adam optimization algorithm to complete model training and optimization.

[0013] By adopting the above technical solution, the 1DCNN module is used to extract local features, the BiLSTM module is used to learn long-term dependencies, and the residual mechanism is used to fuse dual features. Combined with standardized data processing and model optimization strategies, the synergistic expression of local features and long-term time series features is achieved, avoiding the decay of information in deep networks. Therefore, the accuracy and stability of stock index prediction are improved, and the prediction is adapted to complex financial time series prediction scenarios.
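As an illustrative, non-limiting sketch, the feature dimensions named in steps S1 to S6 can be traced for a single sample as follows. The window length of 20, the kernel size of 3, and the unpadded convolution are assumptions made only for this example; the text does not specify them. The function name `trace_shapes` is hypothetical.

```python
def trace_shapes(window, n_features=5, kernel_size=3, cnn_channels=32,
                 lstm_hidden=64):
    """Return (stage, shape) pairs for one sample passing through S1 to S6."""
    # Assumption: unpadded convolution with stride 1, so the time axis shrinks.
    conv_len = window - kernel_size + 1
    if conv_len < 1:
        raise ValueError("window must be at least kernel_size")
    lstm_out = 2 * lstm_hidden  # forward and backward states concatenated
    return [
        ("S1/S2 input window", (window, n_features)),
        ("S3 1DCNN output", (conv_len, cnn_channels)),
        ("S4 BiLSTM last step", (lstm_out,)),
        ("S5 projected CNN last step", (lstm_out,)),
        ("S5 fused feature", (lstm_out,)),
        ("S6 prediction", (1,)),
    ]


for stage, shape in trace_shapes(window=20):
    print(f"{stage}: {shape}")
```

With 64 hidden units per direction, the bidirectional output is 128-dimensional, which is why the 32-dimensional 1DCNN feature has to be projected before the element-wise addition in S5.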

[0014] Preferably, in step S1, obtaining multi-dimensional market characteristic data of historical stock transactions includes:

[0015] The opening price, highest price, lowest price, closing price, and trading volume in the stock trading market are collected as core multi-dimensional market features;

[0016] The raw data of the core multidimensional market features are arranged in an orderly manner according to the chronological order of stock transactions;

[0017] A fixed-length sliding window is set, and the multidimensional feature data within the window is used as the model input, with the stock closing price of the next trading day as the prediction target, to construct supervised learning samples.

[0018] By adopting the above technical solution, the stock trading status is comprehensively represented by multi-dimensional core market features, and arranging the data in chronological order preserves its temporal order. The sliding window method converts the non-stationary time-series data into fixed-length supervised learning samples. Therefore, standardized and structured input data is obtained for subsequent model training, ensuring the effectiveness of subsequent processing.

[0019] Preferably, in step S2, the Min-Max normalization processing of the model input features and the prediction target includes:

[0020] Determine the minimum and maximum values of the feature data and the prediction target data in each dimension of the dataset;

[0021] Based on these extreme values, the original feature values and the prediction target values are uniformly mapped to the [0,1] numerical range;

[0022] All normalized data are validated to ensure that there are no outliers or values outside the range.

[0023] By adopting the above technical solution, Min-Max normalization is used to eliminate the dimensional differences between different features and unify the data range. At the same time, abnormal data is removed through verification. Therefore, the effects of avoiding gradient imbalance during model training, accelerating convergence speed, and ensuring the validity and standardization of input data are achieved.

[0024] Preferably, in step S3, inputting the normalized time series data into the 1DCNN module to extract local variation features includes:

[0025] A 1DCNN feature extraction module consisting of one-dimensional convolutional layers and ReLU non-linear activation function layers is constructed.

[0026] Normalized time-series data is input into a one-dimensional convolutional layer, and local temporal features are captured through sliding convolution in the time dimension.

[0027] The output of the convolutional layer is input into the ReLU activation function layer for nonlinear transformation, and a 32-dimensional local feature representation is output.

[0028] By adopting the above technical solution, the one-dimensional convolutional layer captures local fluctuation patterns in the time dimension, the ReLU activation function introduces nonlinear factors, and the output feature dimension is standardized. Therefore, it can accurately extract short-term fluctuation features and local trends of stocks, solve the problem that linear models cannot characterize complex financial relationships, and provide effective input for subsequent long-term learning.

[0029] Preferably, in step S4, learning long-term dependencies by inputting local features into the stacked BiLSTM module includes:

[0030] A two-layer stacked bidirectional BiLSTM network module is constructed, with 32-dimensional local features as the module input;

[0031] The temporal dependencies of stock time series are learned synchronously through LSTM networks in both the forward and reverse directions.

[0032] Select the hidden state of the last time step of the BiLSTM network sequence and output a 128-dimensional long-term temporal feature representation.

[0033] By adopting the above technical solution, the bidirectional structure is used to fully capture the forward and backward dependencies of time series data, the two-layer stacked architecture is used to mine deep long-term features, and the output dimension is standardized to adapt to the subsequent fusion process. Therefore, the model can fully learn the long-term dynamic trend of stock time series, avoid the gradient vanishing of traditional recurrent networks, and improve the model's time series modeling ability.

[0034] Preferably, the two-layer stacked bidirectional BiLSTM network module includes:

[0035] Set the number of hidden layer units in the BiLSTM network to 64 and enable the bidirectional learning structure;

[0036] A two-layer stacked network architecture is adopted to extract deep long-term dependency features of stock time series layer by layer;

[0037] The input feature dimension of the module is matched to the output feature dimension of the 1DCNN module to ensure the consistency of data transmission.

[0038] By adopting the above technical solutions, the BiLSTM module achieves improved long-term dependency learning efficiency and accuracy, and ensures smooth connection between all parts of the model. This is due to the reasonable number of hidden layer units to ensure feature representation capability, the bidirectional structure to improve temporal dependency capture, the two-layer stacking to deepen feature mining, and the dimension matching to ensure smooth data transmission.

[0039] Preferably, in step S5, constructing the residual feature fusion module to achieve feature fusion includes:

[0040] Extract the 32-dimensional local feature vector from the last time step of the 1DCNN module output features;

[0041] The 32-dimensional local feature vectors are linearly mapped to 128 dimensions through a fully connected layer, matching the output dimension of the BiLSTM module.

[0042] The mapped local features are added element-wise to the long-term temporal features output by BiLSTM to complete the residual feature fusion.

[0043] By adopting the above technical solution, the fully connected layer achieves dimension matching, and the element-wise residual addition fuses local and long-term features while preserving the core feature information. Therefore, information decay in deep networks is avoided, local and long-term features are expressed collaboratively, and the model's feature expression capability is improved.

[0044] Preferably, in step S6, inputting the fused features into the fully connected layer for regression prediction includes:

[0045] A single-output fully connected regression layer is constructed, with 128-dimensional residual fusion features used as the layer input;

[0046] The high-dimensional fused features are mapped to a one-dimensional feature space through linear transformation of the fully connected layer;

[0047] The output value of the one-dimensional feature space is used as the prediction result of the stock closing price for the next trading day.

[0048] By adopting the above technical solution, the single-output regression layer matches the closing price prediction requirement, and the linear transformation converts the high-dimensional features into a single predicted value. Therefore, the fused features are converted into a specific closing price prediction, completing the mapping from feature space to prediction value space and providing a basis for model training and optimization.

[0049] Preferably, in step S7, model training using mean squared error as the loss function includes:

[0050] The mean squared error is calculated based on the deviation between the model's predicted value and the actual closing price, and is used as the loss function for model training.

[0051] The Adam optimization algorithm is used to iteratively update all network parameters of the 1DCNN-ResBiLSTM model;

[0052] By iteratively performing forward and backward propagation, the loss function value is gradually reduced.

[0053] By adopting the above technical solution, the prediction bias is accurately quantified by the mean square error, the Adam optimization algorithm achieves efficient parameter updates, and the forward and backward propagation work together to iterate and reduce the bias. Therefore, the model prediction effect is accurately measured, the model training convergence speed is accelerated, and the model training stability is improved.
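As an illustrative, non-limiting sketch, the mean squared error loss and the Adam update rule can be written in plain Python as follows. A two-parameter linear model stands in for the full 1DCNN-ResBiLSTM network, and the learning rate, moment coefficients, and iteration count are assumptions chosen for the example, not values given in the text.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class Adam:
    """Adam with bias-corrected first and second moment estimates."""

    def __init__(self, n, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m, self.v, self.t = [0.0] * n, [0.0] * n, 0

    def step(self, params, grads):
        self.t += 1
        updated = []
        for k, (p, g) in enumerate(zip(params, grads)):
            self.m[k] = self.b1 * self.m[k] + (1 - self.b1) * g
            self.v[k] = self.b2 * self.v[k] + (1 - self.b2) * g * g
            m_hat = self.m[k] / (1 - self.b1 ** self.t)
            v_hat = self.v[k] / (1 - self.b2 ** self.t)
            updated.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated


# Toy regression y = 2x + 1 (synthetic data, not market data).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2 * x + 1 for x in xs]
params = [0.0, 0.0]  # [slope, intercept]
opt = Adam(n=2)
losses = []
for _ in range(500):
    preds = [params[0] * x + params[1] for x in xs]      # forward pass
    losses.append(mse(preds, ys))
    errs = [p - y for p, y in zip(preds, ys)]
    grads = [2 * sum(e * x for e, x in zip(errs, xs)) / len(xs),  # dL/d slope
             2 * sum(errs) / len(xs)]                              # dL/d intercept
    params = opt.step(params, grads)                     # parameter update
print(losses[0] > losses[-1])
```

In a full implementation the gradients would come from backpropagation through all network layers rather than from the closed-form expressions used here.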

[0054] Preferably, in step S7, the model training and optimization using the Adam optimization algorithm includes:

[0055] Historical stock trading data is selected as the dataset for model training.

[0056] The dataset is divided into training and testing sets according to a certain ratio, which are used for model training and performance verification, respectively.

[0057] Iterate through the training process until the loss function value stabilizes and the model converges, completing the training and optimization of the entire model.

[0058] By adopting the above technical solutions, the training effectiveness is guaranteed by using real historical stock data, the training and validation are separated by reasonable dataset partitioning, and the convergence determination ensures the stability of model performance. Therefore, the generalization ability of the model is improved, the training effect is guaranteed, and the trained model can be adapted to actual stock index prediction scenarios.
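As an illustrative, non-limiting sketch, the dataset partitioning and the convergence check can be expressed as follows. The text only says "a certain ratio", so the 0.8 training ratio, the chronological (unshuffled) split, and the patience and tolerance values are all assumptions for this example. The helper names `chronological_split` and `has_converged` are hypothetical.

```python
def chronological_split(samples, train_ratio=0.8):
    """Split time-ordered samples without shuffling; the earlier part trains."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be in (0, 1)")
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


def has_converged(losses, patience=5, tol=1e-4):
    """True when the last `patience` losses improve on the earlier best by less than tol."""
    if len(losses) <= patience:
        return False
    best_before = min(losses[:-patience])
    best_recent = min(losses[-patience:])
    return best_before - best_recent < tol


train, test = chronological_split(list(range(10)))
print(train, test)

history = [1.0, 0.5, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
print(has_converged(history))
```

Keeping the test set strictly later in time than the training set is a common choice for time series, because it prevents samples from the evaluation period from influencing training.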

[0059] This invention provides a stock index prediction method based on 1DCNN-ResBiLSTM. It has the following beneficial effects:

[0060] 1. This invention achieves the technical effect of accurately capturing local fluctuation features and short-term trend information of stock time series by adopting a 1DCNN module combined with time-dimensional sliding convolution and ReLU nonlinear activation. Compared with the existing technology that simply relies on recurrent neural networks or traditional convolutional networks to extract local features, this invention solves the problem that it does not fully characterize short-term stock price changes and cannot effectively mine local time series patterns.

[0061] 2. The present invention adopts a two-layer stacked bidirectional BiLSTM module technical solution, which achieves the technical effect of comprehensively and fully capturing the long-term dependency relationship and dynamic trend of stock time series. Compared with the existing technical solutions of unidirectional LSTM or single-layer recurrent neural network modeling long-term time series, it solves the problems of incomplete capture of forward and backward dependency relationship of stock time series data and insufficient mining of deep long-term features.

[0062] 3. This invention adopts a residual feature fusion mechanism, which realizes the element-wise fusion of local features of 1DCNN and long-term temporal features of BiLSTM through dimensional mapping of fully connected layers. This achieves the technical effect of avoiding information decay in deep networks and realizing the collaborative expression of local and long-term features. Compared with the existing technology, which lacks an effective feature fusion mechanism or simply splices features, this invention solves the problems of weak feature expression ability and inability to take into account both local and long-term temporal features.

Attached Figure Description

[0063] Figure 1 is a schematic diagram illustrating the steps of a stock index prediction method based on 1DCNN-ResBiLSTM according to the present invention.

Detailed Implementation

[0064] The technical solution of the present invention will now be clearly and completely described with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0065] Referring to Figure 1, this invention provides a stock index prediction method based on 1DCNN-ResBiLSTM, comprising the following steps:

[0066] S1, Data Acquisition and Sample Construction: Acquire multi-dimensional market feature data of historical stock transactions, sort them by time, and construct supervised learning samples using the sliding window method. Use the multi-dimensional feature data in the window as the model input, and the closing price of the stock on the next trading day as the prediction target.

[0067] Furthermore, in S1, the acquisition of multi-dimensional market characteristic data of historical stock transactions includes:

[0068] The opening price, highest price, lowest price, closing price, and trading volume in the stock trading market are collected as core multi-dimensional market features;

[0069] The raw data of core multidimensional market characteristics are arranged in an orderly manner according to the chronological order of stock transactions;

[0070] A fixed-length time sliding window is set, and the multidimensional feature data within the window is used as the model input, with the stock closing price of the next trading day as the prediction target, to construct supervised learning samples;

[0071] Specifically, the selected opening price, highest price, lowest price, and closing price directly reflect stock price fluctuations, while the trading volume reflects market trading activity. Together, the price and volume features characterize the market state of stock trading. Arranging them in chronological order preserves the temporal order of the data, which financial time series analysis requires. The fixed-length sliding time window converts the non-stationary stock time series into a set of fixed-length samples for supervised learning. Each sample is a multivariate time series of size (number of time steps) × 5. This sample set provides standardized and structured input for subsequent data preprocessing, so that the preprocessing stage can process time series data of the same dimension in batches.
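As an illustrative, non-limiting sketch, the sliding window sample construction of S1 can be written as follows. The window length of 3 and the synthetic rows are assumptions for the example; the column order (open, high, low, close, volume) follows the five features listed above, and the helper name `make_windows` is hypothetical.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]  # one trading day: (open, high, low, close, volume)


def make_windows(rows: Sequence[Row], window: int, close_idx: int = 3
                 ) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (input window, next-day close) pairs from time-ordered rows."""
    if window < 1 or len(rows) <= window:
        raise ValueError("need more rows than the window length")
    inputs, targets = [], []
    for start in range(len(rows) - window):
        end = start + window
        inputs.append([list(r) for r in rows[start:end]])  # window x 5 block
        targets.append(rows[end][close_idx])                # close on the next day
    return inputs, targets


# Six days of synthetic OHLCV rows (not real market data).
rows = [(10.0 + d, 11.0 + d, 9.0 + d, 10.5 + d, 1000.0 + 10 * d) for d in range(6)]
X, y = make_windows(rows, window=3)
print(len(X), len(X[0]), len(X[0][0]))  # 3 3 5
print(y)                                # [13.5, 14.5, 15.5]
```

Each target is the closing price of the day immediately after its window, so a series of N rows yields N minus the window length samples.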

[0072] S2, Data Preprocessing: Min-Max normalization is applied to the model input features and prediction targets obtained in S1 to map the feature values to a specified numerical range.

[0073] Furthermore, in S2, the model input features and prediction targets are processed using Min-Max normalization, including:

[0074] Determine the minimum and maximum values of the feature data and the prediction target data in each dimension of the dataset;

[0075] Based on these extreme values, the original feature values and the prediction target values are uniformly mapped to the [0,1] numerical range;

[0076] Validate all normalized data to ensure that there are no outliers or values outside the range.

[0077] Specifically, for the multivariate time series samples constructed in S1, the numerical scales of price features such as the opening price and highest price differ significantly from that of the trading volume feature, so inputting them directly into the model would lead to gradient imbalance and slow convergence during training. Therefore, Min-Max normalization is applied uniformly to all feature dimensions and to the prediction target. The calculation formula is as follows:

[0078] $x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$;

[0079] where $x$ is the original data in the sample obtained from S1, $x_{\min}$ is the minimum value of the corresponding feature dimension across the entire dataset, $x_{\max}$ is the maximum value of the corresponding feature dimension across the entire dataset, and $x'$ is the normalized value, which this formula maps to the [0,1] interval. Verifying the normalized data eliminates outliers that would fall outside the interval, ensuring the validity of the data. The preprocessed time series data no longer carries differences in units, which provides a stable and standardized input for local feature extraction by the subsequent 1DCNN module and improves the accuracy of the convolution operations in capturing local time series features.
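As an illustrative, non-limiting sketch, the per-dimension Min-Max mapping and the range validation of S2 can be written as follows. Mapping a constant column to 0.0 and the small tolerance used in the validation are assumptions for the example, and the helper names are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-column minimum and maximum over the dataset."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def transform(rows, mins, maxs):
    """Map each column to [0, 1]; a constant column maps to 0.0 (assumption)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(r, mins, maxs)]
        for r in rows
    ]


def validate(rows, eps=1e-12):
    """True when every normalized value lies inside [0, 1]."""
    return all(-eps <= v <= 1.0 + eps for r in rows for v in r)


# Two columns with very different scales, e.g. a price and a volume.
data = [[10.0, 1000.0], [12.0, 3000.0], [11.0, 2000.0]]
mins, maxs = fit_min_max(data)
scaled = transform(data, mins, maxs)
print(scaled)            # [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
print(validate(scaled))  # True
```

The text computes the extreme values over the entire dataset. Computing them on the training portion only is a common variation that keeps information from the test period out of the scaling.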

[0080] S3, Local Feature Extraction: Input the normalized time series data from S2 into the 1DCNN module to extract the local variation features of the stock time series and output the feature representation of the specified dimension.

[0081] Furthermore, in S3, the normalized time series data is input into the 1DCNN module to extract local variation features, including:

[0082] A 1DCNN feature extraction module consisting of one-dimensional convolutional layers and ReLU non-linear activation function layers is constructed.

[0083] Normalized time-series data is input into a one-dimensional convolutional layer, and local temporal features are captured through sliding convolution in the time dimension.

[0084] The output of the convolutional layer is input into the ReLU activation function layer for nonlinear transformation, and a 32-dimensional local feature representation is output.

[0085] Specifically, for the normalized time-series data output by S2, the one-dimensional convolutional layer slides along the time dimension, and its local receptive field captures the short-term fluctuation patterns and local trend features of stock prices. The number of input channels of the one-dimensional convolutional layer matches the 5-dimensional market features, and the number of output channels is set to 32. Weight sharing in the convolution kernel enables efficient extraction of local features and reduces the number of model parameters. The data after the convolution operation retains its temporal order. A nonlinear transformation is then performed using the ReLU activation function, whose expression is:

[0086] $f(v) = \max(0, v)$;

[0087] where $v$ represents the input value of the ReLU activation function, that is, the output feature value of the one-dimensional convolutional layer, and $f(v)$ represents the output value of the ReLU activation function;

[0088] This operation can introduce nonlinear factors to solve the problem that linear models cannot characterize the complex nonlinear relationships in the financial market. It maps features into 32-dimensional local feature representations, which not only completes the extraction and dimensionality increase of local features, but also provides high-dimensional and effective local feature inputs for the subsequent BiLSTM module to learn long-term temporal dependencies, ensuring the continuous transmission of temporal features.
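As an illustrative, non-limiting sketch, a one-dimensional convolution that slides along the time axis, followed by ReLU, can be written in plain Python as follows. The toy sizes (2 input channels, 2 filters, kernel width 2) stand in for the 5 input channels and 32 output channels described above. The kernel width and the absence of padding are assumptions, since the text does not specify them.

```python
def relu(v: float) -> float:
    """ReLU activation: negative inputs become zero."""
    return v if v > 0.0 else 0.0


def conv1d_relu(x, weight, bias):
    """x: T x C_in; weight: C_out x K x C_in; bias: C_out.

    Returns a (T - K + 1) x C_out feature sequence (no padding, stride 1).
    """
    k = len(weight[0])
    out = []
    for t in range(len(x) - k + 1):          # slide along the time axis
        step = []
        for w_o, b_o in zip(weight, bias):   # the same filter is reused at every t
            s = b_o
            for j in range(k):
                for c, w in enumerate(w_o[j]):
                    s += w * x[t + j][c]
            step.append(relu(s))
        out.append(step)
    return out


# Toy input: 4 time steps, 2 channels.
x = [[1.0, 0.0], [2.0, 1.0], [0.0, 3.0], [1.0, 1.0]]
weight = [
    [[1.0, 0.0], [-1.0, 0.0]],   # filter 0: x[t][0] - x[t+1][0]
    [[0.0, 1.0], [0.0, 1.0]],    # filter 1: x[t][1] + x[t+1][1]
]
bias = [0.0, 0.0]
features = conv1d_relu(x, weight, bias)
print(features)  # [[0.0, 1.0], [2.0, 4.0], [0.0, 4.0]]
```

Filter 0 responds only where the first channel falls from one step to the next, which illustrates how a small kernel picks up a short-term local pattern while ReLU suppresses the negative responses.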

[0089] S4, Long-term Dependency Learning: Input the output features of S3 into a stacked BiLSTM module to learn the long-term dependencies of the stock time series, select the hidden state of the last time step of the series as the feature representation, and output the feature representation of the specified dimension.

[0090] Furthermore, in S4, inputting the local features into the stacked BiLSTM module to learn long-term dependencies includes:

[0091] A two-layer stacked bidirectional BiLSTM network module is constructed, with 32-dimensional local features as the module input;

[0092] The temporal dependencies of stock time series are learned synchronously through LSTM networks in both the forward and reverse directions.

[0093] Select the hidden state of the last time step of the BiLSTM network sequence and output a 128-dimensional long-term temporal feature representation;

[0094] The construction of a two-layer stacked bidirectional BiLSTM network module includes:

[0095] Set the number of hidden layer units in the BiLSTM network to 64 and enable the bidirectional learning structure;

[0096] A two-layer stacked network architecture is adopted to extract deep long-term dependency features of stock time series layer by layer;

[0097] The input feature dimension of the module is matched to the output feature dimension of the 1DCNN module to ensure the consistency of data transmission.

[0098] Specifically, for the 32-dimensional local features output by S3, the gating mechanism of the BiLSTM network is used to solve the gradient vanishing problem of traditional recurrent neural networks. The forward LSTM learns features from the past to the future of the time series, and the backward LSTM learns features from the future to the past of the time series. The bidirectional structure can completely capture the forward and backward long-term dependencies of stock time series data. The number of hidden layer units is set to 64. The bidirectional structure makes the output feature dimension of a single time step 128. The two-layer stacked network architecture can explore deeper long-term time series features layer by layer, improving the model's ability to learn complex financial time series patterns. The hidden state of the last time step is selected as the feature representation of the entire time window. The long-term features of the entire time series window can be condensed into a fixed-dimensional feature vector. This 128-dimensional long-term time series feature not only integrates the local feature base extracted by S3, but also realizes the effective learning of long-term dependencies, providing deep time series feature input for subsequent residual feature fusion.
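As an illustrative, non-limiting sketch, the forward pass of a two-layer bidirectional LSTM with 64 hidden units per direction can be written in plain Python as follows. The weights are random and untrained, so the example only demonstrates the data flow and the 128-dimensional output. The window length of 10, the weight initialization range, and the gate ordering are assumptions for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_lstm(in_dim, hidden, rng):
    """Random weights for the four gates (i, f, g, o), each hidden x (in_dim + hidden)."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim + hidden)]
                for _ in range(hidden)]
    return {"W": [mat() for _ in range(4)], "b": [[0.0] * hidden for _ in range(4)]}


def lstm_run(seq, p, hidden):
    """Run one LSTM direction over seq; return the hidden state at every step."""
    h, c, outs = [0.0] * hidden, [0.0] * hidden, []
    for x in seq:
        z = list(x) + h  # concatenate input and previous hidden state
        gates = [[sum(w * v for w, v in zip(row, z)) + b for row, b in zip(W, bvec)]
                 for W, bvec in zip(p["W"], p["b"])]
        i = [sigmoid(a) for a in gates[0]]    # input gate
        f = [sigmoid(a) for a in gates[1]]    # forget gate
        g = [math.tanh(a) for a in gates[2]]  # candidate cell state
        o = [sigmoid(a) for a in gates[3]]    # output gate
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        outs.append(h)
    return outs


def bilstm_layer(seq, fwd, bwd, hidden):
    """Concatenate forward states with re-aligned backward states at each step."""
    hf = lstm_run(seq, fwd, hidden)
    hb = lstm_run(seq[::-1], bwd, hidden)[::-1]
    return [a + b for a, b in zip(hf, hb)]


def stacked_bilstm_last(seq, in_dim=32, hidden=64, layers=2, seed=0):
    """Two stacked BiLSTM layers; return the features at the last time step."""
    rng = random.Random(seed)
    for layer in range(layers):
        d = in_dim if layer == 0 else 2 * hidden
        seq = bilstm_layer(seq, init_lstm(d, hidden, rng),
                           init_lstm(d, hidden, rng), hidden)
    return seq[-1]


rng = random.Random(1)
window = [[rng.random() for _ in range(32)] for _ in range(10)]  # 10 steps x 32 features
h_last = stacked_bilstm_last(window)
print(len(h_last))  # 128
```

This sketch reads "the hidden state of the last time step" as the concatenated output at the final time index. Another possible reading takes the final state of each direction, in which case the backward half would come from the first time index; the text does not say which is intended.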

[0099] S5, Residual Feature Fusion: Extract the feature vector of the last time step output by the 1DCNN module in S3, linearly map it to the same dimension as the output feature of S4 through a fully connected layer, and then add it element by element with the output feature of S4 to achieve residual feature fusion;

[0100] Furthermore, in S5, the residual feature fusion module is constructed to achieve feature fusion, including:

[0101] Extract the 32-dimensional local feature vector from the last time step of the 1DCNN module output features;

[0102] The 32-dimensional local feature vectors are linearly mapped to 128 dimensions through a fully connected layer, matching the output dimension of the BiLSTM module.

[0103] The mapped local features are added element-wise to the long-term temporal features output by BiLSTM to complete the residual feature fusion.

[0104] Specifically, regarding the 32-dimensional local features output by S3 and the 128-dimensional long-term time-series features output by S4, to address the information decay problem during deep network training and simultaneously achieve the collaborative representation of local and long-term features, the 32-dimensional local feature vector from the last time step of the 1DCNN module in S3 is first extracted. This vector contains the core short-term fluctuation information of the stock time-series data. Then, a fully connected layer is used for linear mapping to transform the 32-dimensional feature vector to 128 dimensions. The mapping process is as follows:

[0105] $\mathbf{h}_{c}' = \mathbf{W}_{p}\,\mathbf{h}_{c} + \mathbf{b}_{p}$;

[0106] where $\mathbf{h}_{c} \in \mathbb{R}^{32}$ is the 32-dimensional local feature vector, $\mathbf{W}_{p} \in \mathbb{R}^{128 \times 32}$ is the weight matrix of the fully connected layer, $\mathbf{b}_{p} \in \mathbb{R}^{128}$ is the bias term, and $\mathbf{h}_{c}' \in \mathbb{R}^{128}$ is the mapped 128-dimensional feature vector, so that the local feature dimension is consistent with the long-term time series feature dimension output by S4. The mapped local features and the long-term time series features are then added element by element to achieve residual feature fusion. The fused features retain both the local fluctuation features and the long-term dependency features of the stock time series data, effectively improving the feature expression ability of the model and providing a more comprehensive and effective fused feature input for subsequent regression prediction.
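As an illustrative, non-limiting sketch, the linear projection followed by element-wise addition in S5 can be written as follows. The toy sizes (2 mapped to 3) stand in for the 32 to 128 mapping described above, and the weights are arbitrary values chosen so the arithmetic is easy to follow. The helper names are hypothetical.

```python
def linear(x, weight, bias):
    """y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def residual_fuse(cnn_last, lstm_last, weight, bias):
    """Project the CNN feature to the BiLSTM width, then add element-wise."""
    projected = linear(cnn_last, weight, bias)
    if len(projected) != len(lstm_last):
        raise ValueError("projection width must match the BiLSTM feature width")
    return [p + h for p, h in zip(projected, lstm_last)]


cnn_last = [1.0, 2.0]            # stand-in for the 32-dim CNN feature at the last step
lstm_last = [0.5, -0.5, 0.0]     # stand-in for the 128-dim BiLSTM feature
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, -1.0]
print(residual_fuse(cnn_last, lstm_last, W, b))  # [1.5, 1.5, 2.0]
```

Because the fusion is an addition rather than a concatenation, the fused vector keeps the BiLSTM width, and the projection is what makes the two feature vectors compatible.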

[0107] S6, Prediction Result Output: Input the fused features from S5 into the fully connected layer for regression prediction, and output the predicted closing price of the stock for the next trading day;

[0108] Furthermore, in S6, the process of inputting fused features into the fully connected layer for regression prediction includes:

[0109] A single-output fully connected regression layer is constructed, with 128-dimensional residual fusion features used as the layer input;

[0110] The high-dimensional fused features are mapped to a one-dimensional feature space through linear transformation of the fully connected layer;

[0111] The output value of the one-dimensional feature space is used as the prediction result of the stock closing price for the next trading day;

[0112] Specifically, for the 128-dimensional residual fusion feature output by S5, which integrates local and long-term features of stock time series data, a single-output fully connected regression layer is constructed. A linear transformation maps the high-dimensional fusion feature to a one-dimensional feature space. The transformation process is as follows:

[0113] $\hat{y} = W_{o} F_{\text{fuse}} + b_{o}$;

[0114] where $F_{\text{fuse}} \in \mathbb{R}^{128}$ is the 128-dimensional fusion feature, $W_{o} \in \mathbb{R}^{1 \times 128}$ is the weight matrix of the regression layer, $b_{o}$ is the bias term of the regression layer, and the one-dimensional output $\hat{y}$ is the predicted closing price of the stock on the next trading day. The fully connected regression layer thus transforms the high-dimensional fusion features into a specific price prediction, completing the mapping from the feature space to the prediction space. This predicted value is the model output that is compared with the true closing price, and it is the basis for the loss calculation in the subsequent model training and optimization.
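As a small sketch of the single-output regression layer, the code below computes a dot product plus a bias over a 128-dimensional input. Because S2 normalizes the prediction target to [0,1], the layer output is on the normalized scale. Converting it back to a price with an inverse Min-Max step is an assumption of this sketch and is not described in the text. The names `regression_head` and `inverse_min_max` are hypothetical, and the numbers are made up.

```python
def regression_head(fused, weight, bias):
    """Map a fused feature vector to one scalar: y_hat = w . f + b."""
    if len(fused) != len(weight):
        raise ValueError("feature and weight lengths differ")
    return sum(w * f for w, f in zip(weight, fused)) + bias

def inverse_min_max(scaled, lo, hi):
    """Undo x' = (x - lo) / (hi - lo); assumed post-processing, not in the source."""
    return scaled * (hi - lo) + lo

fused = [0.01] * 128          # stand-in for the S5 output
weight = [0.25] * 128         # untrained, illustrative weights
y_scaled = regression_head(fused, weight, bias=0.1)
y_price = inverse_min_max(y_scaled, lo=3000.0, hi=5000.0)  # hypothetical close range
print(round(y_scaled, 4), round(y_price, 2))
```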

[0115] S7, Model Training and Optimization: Using mean squared error as the loss function, the model parameters are iteratively updated through the Adam optimization algorithm to complete model training and optimization;

[0116] Furthermore, in S7, model training using mean squared error as the loss function includes:

[0117] The mean squared error is calculated based on the deviation between the model's predicted value and the actual closing price, and is used as the loss function for model training.

[0118] The Adam optimization algorithm is used to iteratively update all network parameters of the 1DCNN-ResBiLSTM model;

[0119] By iteratively performing forward and backward propagation, the loss function value is gradually reduced.

[0120] In S7, model training and optimization using the Adam optimization algorithm include:

[0121] Historical stock trading data is selected as the dataset for model training.

[0122] The dataset is divided into training and testing sets according to a certain ratio, which are used for model training and performance verification, respectively.

[0123] Iterate through the training process until the loss function value stabilizes and the model converges, thus completing the training and optimization of the entire model.

[0124] Specifically, for the predicted stock closing price output by S6, the historical trading data of the CSI 300 Index since its launch is selected as the dataset. This dataset comes from the same data source as the sample construction data of S1, ensuring data consistency and validity. The dataset is divided into training and test sets in an 8:2 ratio. The training set is used for iterative updates of the model parameters, and the test set is used to verify the generalization ability of the model. The mean squared error is used as the loss function to measure the deviation between the predicted value and the true value. The calculation formula is as follows:

[0125] $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$;

[0126] where $N$ is the sample size, $y_i$ is the true closing price of the $i$-th sample, and $\hat{y}_i$ is the corresponding predicted value output by S6. The predicted values are obtained through forward propagation and the loss value is calculated. The gradient of the loss is then passed by backpropagation to all network parameters of the 1DCNN, BiLSTM, residual fusion, and fully connected layers, and the parameters are iteratively updated using the Adam optimization algorithm. Adam combines the advantages of the momentum method and the adaptive learning rate method, which accelerates model convergence and improves training stability.
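The mean squared error loss described above is the average of the squared differences between true and predicted closing prices. A minimal sketch follows; the function name `mse_loss` and the sample values are illustrative only.

```python
def mse_loss(y_true, y_pred):
    """Mean of squared differences between true and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up normalized closing prices and predictions.
loss = mse_loss([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
print(loss)
```

Squaring penalizes large deviations more heavily than small ones, which is why a few badly predicted days can dominate the loss.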

[0127] Through repeated forward and backward propagation, the loss function value is gradually reduced. When the loss function values of both the training set and the test set stabilize and no longer decrease significantly, the model is considered to have converged, and the model training and optimization are completed. The trained model can achieve high-precision closing price prediction based on new stock time series data, and can be effectively applied to stock index prediction scenarios.

[0128] The following is a further description with reference to the embodiments:

[0129] The core dataset for this experiment is the historical trading data of the CSI 300 Index from its launch to the date of the experiment. Each record has five market features: opening price, highest price, lowest price, closing price, and trading volume. Valid, continuous trading records were obtained from financial trading databases. The dataset is divided into a training set and a test set in the conventional 8:2 ratio. The training set is used for model parameter training and iterative optimization, while the test set is used for model performance verification and comparison. The training set and test set do not overlap, which avoids data leakage that would distort model training and verification.
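One way to obtain a non-overlapping 8:2 split is to cut the time-ordered records at a single index. The text gives only the ratio and the no-overlap requirement, so the chronological cut below is an assumption of this sketch, and `chronological_split` is a hypothetical helper name.

```python
def chronological_split(records, train_ratio=0.8):
    """Split time-ordered records into an earlier training part and a later test part."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    cut = int(len(records) * train_ratio)
    return records[:cut], records[cut:]

days = list(range(10))                 # stand-in for 10 trading days in time order
train, test = chronological_split(days)
print(train, test)
```

Keeping the later records for testing means the model is evaluated only on days that come after everything it was trained on.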

[0130] In this embodiment, the construction of the 1DCNN-ResBiLSTM model is completed strictly according to the technical solution steps. The parameter settings, dimension transformations, and operation methods of each module all follow the core technical content of the disclosure document, including the following steps:

[0131] S1, Sample construction parameters: The time sliding window length is set to 30 trading days, that is, the multidimensional feature data of the previous 30 trading days is used as the model input to predict the stock closing price of the next trading day. Each input sample is a 30×5 multivariate time series, where 30 represents the time step length and 5 represents the feature dimensions of opening price, highest price, lowest price, closing price and trading volume.

[0132] S2, Data preprocessing operation: The Min-Max normalization method is used to map each feature value and the prediction target value to the [0,1] interval. Each normalized value is computed from the original value and the minimum and maximum of the corresponding feature, as (value − minimum) / (maximum − minimum). This eliminates differences in units and improves the stability and convergence speed of model training.

[0133] S3, 1DCNN Module Construction: A one-dimensional convolutional feature extraction module consisting of a one-dimensional convolutional layer and a ReLU non-linear activation function layer is constructed. The parameters of the convolutional layer are set as follows: 5 input channels, 32 output channels, kernel size 3, and padding=1. The convolution operation slides along the time dimension, capturing short-term fluctuation patterns and local feature information in the stock price sequence through the local receptive field. With a kernel size of 3 and padding of 1 (assuming a stride of 1), the output keeps the same 30 time steps as the input. The output of the convolutional layer is non-linearly transformed by the ReLU activation function, ultimately mapping the feature dimension at each time step from the original 5 to 32.

[0134] S4, BiLSTM Module Construction: Construct a BiLSTM module with a two-layer stacked bidirectional LSTM structure. The parameters are precisely set as follows: input feature dimension 32, number of hidden layer units 64, number of network layers 2, and bidirectional structure enabled. The bidirectional LSTM learns sequence information from both the forward and reverse directions of the time series simultaneously, fully capturing the long-term dependencies and dynamic trends in stock price changes. Due to the enabled bidirectional structure, the feature dimension of the LSTM output is 64×2=128. Then, the hidden state of the last time step of the sequence is selected as the feature representation of the entire 30-trading-day time window.

[0135] S5, Residual Feature Fusion Module Construction: Extract the feature vector of the last time step from the 32-dimensional features output by the one-dimensional convolutional module and map it through a fully connected layer, Linear(32→128), to the same 128 dimensions as the BiLSTM output. Then add the mapped result element by element to the 128-dimensional features output by the BiLSTM module to achieve residual feature fusion, fusing shallow convolutional features with deep temporal features.

[0136] S6, Prediction Layer Construction: Construct a fully connected layer, Linear(128→1), input the fused 128-dimensional features into it for regression prediction, and directly output the predicted stock closing price for the next trading day.

[0137] S7, Basic parameters for model training: Set the batch size for model training to 64, the number of training epochs to 100, and the learning rate of the Adam optimization algorithm to 0.001. Use mean squared error as the loss function, measuring the model prediction deviation as the mean of the squared differences between the predicted and true values, and use the Adam optimization algorithm to iteratively update all network parameters of the model.
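The sample construction and preprocessing settings in S1 and S2 above (a 30-day window over five features, Min-Max scaling to [0,1], next-day close as target) can be sketched as follows. The helper names are hypothetical and the rows are synthetic. The sketch fits the minimum and maximum on whatever rows it is given; the text does not say whether they are computed on the training portion only or on the full dataset.

```python
def fit_min_max(rows):
    """Return per-column (min, max) pairs for a list of equal-length rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_min_max(rows, bounds):
    """Scale every column to [0, 1]; a constant column would divide by zero."""
    return [[(x - lo) / (hi - lo) for x, (lo, hi) in zip(row, bounds)]
            for row in rows]

def build_samples(rows, window=30, target_col=3):
    """Pair each run of `window` consecutive rows with the next row's close."""
    return [(rows[end - window:end], rows[end][target_col])
            for end in range(window, len(rows))]

# Synthetic rows [open, high, low, close, volume]; values are made up.
raw = [[100.0 + d, 101.0 + d, 99.0 + d, 100.5 + d, 1000.0 + 10 * d]
       for d in range(35)]
bounds = fit_min_max(raw)
scaled = apply_min_max(raw, bounds)
samples = build_samples(scaled)
x0, y0 = samples[0]
print(len(samples), len(x0), len(x0[0]))
```

With 35 rows and a 30-day window, five samples result, each a 30×5 input paired with one scalar target.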

[0138] The preprocessed 30×5-dimensional training set samples are input into the 1DCNN-ResBiLSTM model. The 1DCNN module extracts 32-dimensional local features, which are then input into the BiLSTM module to obtain 128-dimensional long-term temporal features. After residual fusion, 128-dimensional fused features are obtained. Finally, the predicted stock closing price is output through the fully connected prediction layer to complete the forward propagation of the model.
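To illustrate the first stage of this forward pass, the sketch below implements a one-dimensional convolution along the time axis followed by ReLU in plain Python, using the stated settings (5 input channels, 32 output channels, padding 1) and assuming a kernel size of 3 and a stride of 1. The function name `conv1d_relu` is hypothetical and the kernels are random, untrained values.

```python
import random

def conv1d_relu(seq, kernels, biases, padding=1):
    """seq: T x C_in; kernels: C_out x K x C_in. Returns T_out x C_out after ReLU."""
    c_in = len(seq[0])
    k = len(kernels[0])
    zero = [0.0] * c_in
    padded = [zero] * padding + seq + [zero] * padding   # zero-pad both ends in time
    out = []
    for t in range(len(padded) - k + 1):                  # slide along the time axis
        window = padded[t:t + k]
        step = []
        for kern, b in zip(kernels, biases):
            acc = b + sum(kern[i][c] * window[i][c]
                          for i in range(k) for c in range(c_in))
            step.append(max(0.0, acc))                    # ReLU
        out.append(step)
    return out

rng = random.Random(0)
sample = [[rng.random() for _ in range(5)] for _ in range(30)]   # one 30x5 input
kernels = [[[rng.gauss(0, 0.3) for _ in range(5)] for _ in range(3)]
           for _ in range(32)]
features = conv1d_relu(sample, kernels, [0.0] * 32)
print(len(features), len(features[0]))
```

The output keeps 30 time steps with 32 features each, so the vector at the last time step is the 32-dimensional input to the residual branch.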

[0139] Based on the forward propagation prediction and the actual closing price, the mean square error is calculated as the loss value to quantify the model prediction bias.

[0140] Based on the loss value, the backpropagation algorithm is used in conjunction with the Adam optimization algorithm to iteratively update all network parameters of 1DCNN, BiLSTM, and fully connected layers in the model, thereby gradually reducing the loss function value.
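For reference, a single Adam update over a flat list of parameters can be sketched as below. The learning rate 0.001 is the value given in S7. The decay rates `beta1=0.9` and `beta2=0.999` and `eps=1e-8` are the commonly used defaults and are assumptions here, since the text does not state them. The toy objective (w − 3)² is only for illustration.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count. Returns (params, m, v)."""
    new_p, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g          # momentum-style first moment
        v_i = beta2 * v_i + (1 - beta2) * g * g      # adaptive second moment
        m_hat = m_i / (1 - beta1 ** t)               # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        new_p.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_p, new_m, new_v

w, m, v = [0.0], [0.0], [0.0]
for t in range(1, 201):
    grads = [2.0 * (w[0] - 3.0)]                     # gradient of (w - 3)^2
    w, m, v = adam_step(w, grads, m, v, t)
print(w[0])
```

The first moment supplies the momentum behaviour and the second moment rescales each parameter's step, which is the combination the description attributes to Adam.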

[0141] During training, after every training epoch, the preprocessed test set samples are input into the model for performance verification. When the test set loss has stabilized and shown no significant decrease for 10 consecutive training epochs, the model is considered to have converged, and training is stopped immediately, resulting in the completed 1DCNN-ResBiLSTM stock index prediction model.
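The stopping rule above can be sketched as a small patience counter. The patience of 10 epochs comes from the text. The threshold `min_delta`, which decides what counts as a significant decrease, is an assumption of this sketch, and the class name and loss values are illustrative.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without a significant improvement."""

    def __init__(self, patience=10, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta      # assumed threshold for "significant"
        self.best = float("inf")
        self.stale = 0

    def step(self, loss):
        """Record one epoch's monitored loss; return True when training should stop."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience

stopper = EarlyStopping()
losses = [1.0, 0.5, 0.4] + [0.4] * 20   # made-up per-epoch losses
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
print(stopped_at)
```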

[0142] To verify the predictive performance of the 1DCNN-ResBiLSTM model of this invention, comparative experiments were conducted against several typical prediction models in the field of financial time series forecasting under the same dataset, training, and validation conditions. The comparison models included convolutional neural networks (CNN), long short-term memory networks (LSTM), recurrent neural networks (RNN), and Transformer models. Mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²) were used as unified model evaluation metrics. Smaller MAE, RMSE, and MAPE values indicate smaller prediction errors. The closer R² is to 1, the better the model fits the data and the more accurately it captures the changing patterns of stock time series.
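The four evaluation metrics named above can be computed as in the following sketch, which uses their standard definitions. The function name `regression_metrics` and the price values are illustrative and are not taken from the experiment.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent) and R^2; true values must be non-zero."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total sum of squares
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": 1.0 - ss_res / ss_tot}

metrics = regression_metrics([100.0, 200.0, 300.0, 400.0],
                             [110.0, 190.0, 310.0, 390.0])
print(metrics)
```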

[0143] Table 1 shows the final results of this comparative experiment:

[0144]

[0145] The 1DCNN-ResBiLSTM model significantly outperforms other comparative models across all error metrics. Specifically, MAE is reduced by 40.3% compared to RNN, 43% compared to LSTM, 58.4% compared to CNN, and 59.1% compared to Transformer; RMSE is reduced by 35% compared to RNN, 38.5% compared to LSTM, 55% compared to CNN, and 50.5% compared to Transformer; and MAPE is reduced by 36.2% compared to RNN, 43.4% compared to LSTM, 57.3% compared to CNN, and 59.3% compared to Transformer. This fully demonstrates that the prediction error of the model in this invention is significantly reduced, and the predicted value of stock closing price is closer to the true value.
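Percentage reductions of this kind are conventionally computed relative to the baseline model's error. The sketch below shows that calculation under this assumption. The function name is hypothetical and the input values are placeholders, not the figures from Table 1.

```python
def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed model's error is below the baseline's."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# Placeholder values: a baseline MAE of 50.0 and a proposed-model MAE of 30.0.
print(relative_reduction(50.0, 30.0))
```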

[0146] For the coefficient of determination R², the model of this invention achieved the highest value, 0.9811, which is higher than those of CNN (0.9058), LSTM (0.9493), Transformer (0.9219), and RNN (0.9549). This indicates that the model of this invention fits the stock time series data best, and can more effectively capture the complex dynamic features in stock time series and accurately depict the changing patterns of stock prices.

[0147] Comparative experiments fully validate that the synergistic effect of 1DCNN's local feature extraction, BiLSTM's long-term dependency learning, and residual connection's feature fusion capability solves the technical defects of existing models and significantly improves the accuracy and stability of stock index prediction.

[0148] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims

1. A stock index prediction method based on 1DCNN-ResBiLSTM, characterized in that it includes the following steps: S1, Data Acquisition and Sample Construction: acquire multi-dimensional market feature data of historical stock transactions, sort them by time, and construct supervised learning samples using the sliding window method, with the multi-dimensional feature data in the window as the model input and the closing price of the stock on the next trading day as the prediction target; S2, Data Preprocessing: apply Min-Max normalization to the model input features and prediction targets obtained in S1 to map the values to a specified numerical range; S3, Local Feature Extraction: input the normalized time series data from S2 into the 1DCNN module to extract the local variation features of the stock time series and output a feature representation of a specified dimension; S4, Long-term Dependency Learning: input the output features of S3 into a stacked BiLSTM module to learn the long-term dependencies of the stock time series, select the hidden state of the last time step of the sequence as the feature representation, and output a feature representation of a specified dimension; S5, Residual Feature Fusion: extract the feature vector of the last time step output by the 1DCNN module in S3, linearly map it to the same dimension as the output feature of S4 through a fully connected layer, and then add it element by element to the output feature of S4 to achieve residual feature fusion; S6, Prediction Result Output: input the fused features from S5 into a fully connected layer for regression prediction, and output the predicted closing price of the stock for the next trading day; S7, Model Training and Optimization: using mean squared error as the loss function, iteratively update the model parameters through the Adam optimization algorithm to complete model training and optimization.

2. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S1, obtaining multi-dimensional market feature data of historical stock transactions includes: the opening price, highest price, lowest price, closing price, and trading volume in the stock trading market are collected as the core multi-dimensional market features; the raw data of the core multi-dimensional market features are arranged in the chronological order of stock transactions; and a fixed-length sliding window is set, with the multi-dimensional feature data within the window used as the model input and the stock closing price of the next trading day as the prediction target, to construct supervised learning samples.

3. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S2, applying Min-Max normalization to both the model input features and the prediction target includes: the extreme values of the feature data and the prediction target data within the dataset are determined for each dimension; based on the extreme values, the original feature values and the prediction target values are uniformly mapped to the [0,1] numerical range; and all normalized data are validated to ensure that there are no outliers or values outside the range.

4. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S3, inputting the normalized time series data into the 1DCNN module to extract local variation features includes: a 1DCNN feature extraction module consisting of a one-dimensional convolutional layer and a ReLU non-linear activation function layer is constructed; the normalized time-series data is input into the one-dimensional convolutional layer, and local temporal features are captured through sliding convolution in the time dimension; and the output of the convolutional layer is input into the ReLU activation function layer for nonlinear transformation, and a 32-dimensional local feature representation is output.

5. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S4, learning long-term dependencies by inputting local features into the stacked BiLSTM module includes: a two-layer stacked BiLSTM network module is constructed, with the 32-dimensional local features as the module input; the temporal dependencies of the stock time series are learned simultaneously by LSTM networks in the forward and reverse directions; and the hidden state of the last time step of the BiLSTM network sequence is selected, and a 128-dimensional long-term temporal feature representation is output.

6. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 5, characterized in that constructing the two-layer stacked BiLSTM network module includes: the number of hidden layer units in the BiLSTM network is set to 64 and the bidirectional learning structure is enabled; a two-layer stacked network architecture is adopted to extract deep long-term dependency features of the stock time series layer by layer; and the input feature dimension of the module is set equal to the output feature dimension of the 1DCNN module to ensure the consistency of data transmission.

7. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S5, constructing the residual feature fusion module to achieve feature fusion includes: the 32-dimensional local feature vector at the last time step of the 1DCNN module output features is extracted; the 32-dimensional local feature vector is linearly mapped to 128 dimensions through a fully connected layer, matching the output dimension of the BiLSTM module; and the mapped local features are added element-wise to the long-term temporal features output by the BiLSTM to complete the residual feature fusion.

8. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S6, inputting the fused features into the fully connected layer for regression prediction includes: a single-output fully connected regression layer is constructed, with the 128-dimensional residual fusion features used as the layer input; the high-dimensional fused features are mapped to a one-dimensional feature space through the linear transformation of the fully connected layer; and the output value of the one-dimensional feature space is used as the prediction result of the stock closing price for the next trading day.

9. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S7, model training using mean squared error as the loss function includes: the mean squared error is calculated from the deviation between the model's predicted value and the actual closing price, and is used as the loss function for model training; the Adam optimization algorithm is used to iteratively update all network parameters of the 1DCNN-ResBiLSTM model; and by iteratively performing forward and backward propagation, the loss function value is gradually reduced.

10. The stock index prediction method based on 1DCNN-ResBiLSTM according to claim 1, characterized in that, in step S7, model training and optimization using the Adam optimization algorithm include: historical stock trading data is selected as the dataset for model training; the dataset is divided into a training set and a test set according to a certain ratio, which are used for model training and performance verification, respectively; and the training process is iterated until the loss function value stabilizes and the model converges, completing the training and optimization of the entire model.