A fund income prediction method and system supporting high-frequency trading

By constructing a dynamic financial causal relationship graph and applying time-series graph convolutional encoding, the invention addresses the shortcomings of fund return prediction in high-frequency trading environments: it captures changes in market microstructure and information transmission, improves prediction performance, and provides a quantitative basis for risk-management decisions.

CN122288884A · Pending · Publication Date: 2026-06-26 · QINGDAO SANCHUANG INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
QINGDAO SANCHUANG INFORMATION TECH CO LTD
Filing Date
2026-04-10
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing fund return forecasting methods struggle to capture instantaneous market disturbances caused by breaking news, policy changes, or social media sentiment in high-frequency trading environments, and lack mechanisms to embed events into financial knowledge systems for causal deduction, resulting in poor forecasting performance.

Method used

A dynamic financial causal relationship graph is constructed. Through multimodal high-frequency data stream acquisition and time series graph convolutional coding, fund return prediction values and prediction confidence are generated, and high-frequency trading instructions are generated.

Benefits of technology

It significantly improves the depth and breadth of fund return forecasting, enhances the robustness and adaptability to market dynamics, and provides a quantitative basis for risk control and position management decisions.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention belongs to the field of financial data processing and quantitative trading technology. Specifically, it discloses a method and system for predicting fund returns supporting high-frequency trading. The specific content includes: nanosecond-level time synchronization and alignment of multi-source high-frequency data streams; construction of a dynamic financial causal relationship graph, updating causal edges between nodes in real time based on an improved transfer entropy algorithm; generation of dynamic state embedding vectors for fund entities through a time-series graph convolutional encoder; outputting future return predictions and confidence levels at the millisecond level using a cross-scale attention decoder; and generating high-frequency trading instructions accordingly. This application aims to solve the problem that existing models struggle to accurately identify complex event structures containing quantitative parameters, and facilitates the assessment of the multi-hop impact of events on the underlying assets of target funds through industry chains, competitive landscapes, or macroeconomic transmission paths.

Description

Technical Field

[0001] This invention belongs to the field of financial data processing and quantitative trading technology, specifically relating to a method and system for predicting fund returns that supports high-frequency trading.

Background Technology

[0002] With the deep integration of artificial intelligence and fintech, fund return prediction has become a core component of quantitative investment and robo-advisory systems on fund platforms. Traditional prediction methods primarily rely on historical price series, financial statement indicators, or macroeconomic data, constructing static prediction frameworks through time series models, regression analysis, or shallow machine learning algorithms. While these methods have some effectiveness in low-frequency trading scenarios, their fundamental assumption is that the market is stable and information is fully reflected in structured data, making it difficult to capture instantaneous market disturbances caused by breaking news, policy changes, or social media sentiment. Especially in high-frequency trading environments, asset prices often respond to unstructured textual information on a timescale of seconds to minutes.

[0003] Existing event-driven fund return forecasting solutions attempt to extract key events from news or announcements to aid decision-making, but they still have shortcomings. Firstly, event extraction often relies on rule templates or shallow NLP models, making it difficult to accurately identify complex event structures containing quantitative parameters. Secondly, even when structured events are obtained, there is no mechanism to embed them into a financial knowledge system for causal deduction. Typically, only simple keyword matching or linear weighting is performed, which fails to assess the multi-hop impact of events on the underlying assets of the target fund through industry chains, competitive landscapes, or macroeconomic transmission paths.

Summary of the Invention

[0004] In view of this, in order to solve the problems mentioned in the background technology, a fund return prediction method and system that supports high-frequency trading is proposed.

[0005] The objective of this invention can be achieved through the following technical solution. A first aspect of this invention provides a fund return prediction method that supports high-frequency trading, including: S1. Data stream acquisition: real-time acquisition and processing of multimodal high-frequency data streams, including exchange push data streams, financial information data streams and real-time economic data streams.

[0006] S2. Dynamic graph construction: A dynamic financial causal relationship graph is constructed based on the multimodal high-frequency data stream; the graph includes fund entity nodes, fund heavy holdings entity nodes, industry sector index nodes, macroeconomic indicator nodes, and key event entity nodes extracted from news text; the nodes are connected by directed edges, which represent the causal influence relationship between the nodes, and the weight of the edges is dynamically updated to quantify the influence intensity.

[0007] S3. Temporal graph encoding: Temporal graph convolutional encoding is performed on the dynamic financial causal relationship graph to generate dynamic state embedding vectors of target fund entity nodes through stacked temporal graph convolutional layers.

[0008] S4. Fund return prediction: The dynamic state embedding vector is input into a cross-scale attention decoder, which outputs the predicted fund return value and prediction confidence level over a specific future time span.

[0009] S5. Transaction instruction generation: Generate high-frequency trading instructions based on the fund return forecast and forecast confidence level, and send them to the exchange through the trading interface.

[0010] A second aspect of the present invention provides a fund return prediction system supporting high-frequency trading, comprising: a data stream acquisition module, a dynamic graph construction module, a time series graph encoding module, a fund return prediction module, and a trading instruction generation module.

[0011] The data stream acquisition module is connected to the dynamic graph construction module, the dynamic graph construction module is connected to the time series graph encoding module, the time series graph encoding module is connected to the fund return prediction module, and the fund return prediction module is connected to the transaction instruction generation module.

[0012] The data stream acquisition module acquires and processes multimodal high-frequency data streams in real time, including exchange push data streams, financial information data streams, and real-time economic data streams.

[0013] The dynamic graph construction module constructs a dynamic financial causal relationship graph based on the multimodal high-frequency data stream. The graph includes fund entity nodes, fund heavy holdings entity nodes, industry sector index nodes, macroeconomic indicator nodes, and key event entity nodes extracted from news text. The nodes are connected by directed edges, which represent the causal influence relationship between the nodes, and the weight of the edges is dynamically updated to quantify the influence intensity.

[0014] The temporal graph encoding module performs temporal graph convolutional encoding on the dynamic financial causal relationship graph, and generates dynamic state embedding vectors of target fund entity nodes through stacked temporal graph convolutional layers.

[0015] The fund return prediction module inputs the dynamic state embedding vector into a cross-scale attention decoder, and outputs the predicted fund return value and prediction confidence level for a specific future time span.

[0016] The trading instruction generation module generates high-frequency trading instructions based on the fund return forecast and the forecast confidence level, and sends them to the exchange through the trading interface.

[0017] Compared with existing technologies, the beneficial effects of this invention are as follows: 1. By constructing a dynamic financial causal relationship graph, this invention elevates the fund return prediction problem from the traditional time series analysis paradigm to the network structure analysis paradigm. It can capture the return fluctuations caused by changes in market microstructure and information transmission that traditional models cannot perceive, significantly improving the depth and breadth of prediction.

[0018] 2. This invention uses a temporal graph convolutional network to encode dynamic graphs, which can adaptively learn complex and nonlinear high-frequency interaction patterns between financial entities, thereby enhancing the robustness and adaptability to market dynamic changes.

[0019] 3. This invention outputs both the predicted return and a confidence assessment of that prediction, generated through a specific network structure. This provides a quantitative basis for risk control and position management, enabling high-frequency trading strategies to conduct refined risk management while pursuing high returns.

Description of the Drawings

[0020] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0021] Figure 1 is a schematic diagram illustrating the implementation steps of the method of the present invention.

[0022] Figure 2 is a schematic diagram of the system module connections of the present invention.

Detailed Implementation

[0023] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0024] Example 1

[0025] As shown in Figure 1, the present invention provides a fund return prediction method that supports high-frequency trading. The specific steps are as follows: S1. Data stream acquisition: real-time acquisition and processing of multimodal high-frequency data streams, including exchange push data streams, financial information data streams and real-time economic data streams.

[0026] In one feasible embodiment of the present invention, the exchange push data stream includes order data, transaction data, and buy / sell order book depth snapshot data.

[0027] In one specific example, the order data includes, but is not limited to, the order ID, stock code, order time, buy / sell direction, order price, order quantity, and order status such as new order, canceled order, and completed order.

[0028] The transaction data includes, but is not limited to, the transaction ID, stock code, transaction time, transaction price, transaction quantity, buyer's order ID, and seller's order ID for each transaction order.

[0029] The buy / sell order book depth snapshot data includes, but is not limited to, stock code, snapshot time, order prices and cumulative quantities at five or ten levels for both buy and sell, and total buy / sell order volume.

[0030] The financial information data stream includes real-time news text data and social media sentiment index data.

[0031] In one specific example, the real-time news text data includes, but is not limited to, news ID, publication time, information source, news title, news body, associated stock code / industry / keyword, and sentiment polarity.

[0032] The social media sentiment metrics data include, but are not limited to, platform, stock code, aggregation time window, sentiment index such as the proportion of bearish sentiment, discussion volume, and key opinion leader's speech summary.

[0033] The real-time economic data stream includes published indicators and key parameters.

[0034] In one specific example, the published indicators include, but are not limited to, the Consumer Price Index, the Purchasing Managers' Index, industrial value added, and interest rate decisions.

[0035] The key parameters include, but are not limited to, data name, release time, release value, market expectation value, and previous value.

[0036] In one feasible embodiment of the present invention, the specific process of processing multimodal high-frequency data streams includes: attaching to each ingested data record a timestamp generated by a hardware time synchronization device, wherein the hardware time synchronization device is synchronized with the standard time source of the National Time Service Center through the Precision Time Protocol (PTP).

[0037] For example, the hardware time synchronization device is a PCIe 4.0 hardware clock card based on the IEEE 1588v2 standard, with a time synchronization error of ≤30 nanoseconds and a timestamp generation delay of ≤8 nanoseconds; and the device is connected to the data stream receiving server through the PCIe 4.0 interface, accesses the PTP master clock signal of the National Time Service Center through single-mode optical fiber, and achieves clock synchronization by using an end-to-end delay measurement method.

[0038] The Precision Time Protocol follows IEEE 1588-2008 (PTPv2) and is configured with a master-slave clock architecture. The master clock is a primary clock source provided by the National Time Service Center, and the slave clock is the aforementioned hardware clock card. The protocol's delay measurement period is set to 1 millisecond, the timestamp is captured by hardware, and the clock level is set to the temperature-controlled crystal oscillator level to ensure stable nanosecond-level synchronization accuracy.
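The end-to-end delay measurement mentioned above can be illustrated with the standard PTP offset arithmetic. The following is a minimal sketch, not the clock card's firmware: the function name `ptp_offset_and_delay` and the sample timestamps are hypothetical, and a symmetric network path is assumed.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay, in nanoseconds.

    t1: time the master sends Sync          (master clock)
    t2: time the slave receives Sync        (slave clock)
    t3: time the slave sends Delay_Req      (slave clock)
    t4: time the master receives Delay_Req  (master clock)
    Assumes the path delay is identical in both directions.
    """
    master_to_slave = t2 - t1  # equals delay + offset
    slave_to_master = t4 - t3  # equals delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Hypothetical exchange: slave clock runs 50 ns ahead, path delay is 200 ns.
offset_ns, delay_ns = ptp_offset_and_delay(1000, 1250, 1300, 1450)
```

The slave would then subtract `offset_ns` from its local time; with the 1 millisecond measurement period stated above, this correction would be refreshed a thousand times per second.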

[0039] Multimodal high-frequency data streams with timestamps are fed into a multi-producer single-consumer data bus based on a lock-free circular buffer to achieve non-blocking data sinking.

[0040] For example, the capacity of the lock-free circular buffer is designed to hold 10 seconds of data at the peak rate of the high-frequency data streams, such as 500,000 order data entries / second or 100,000 news data entries / second. The buffer depth is 2GB, and the data blocks are divided into fixed lengths of 128 bytes. A lock-free queue algorithm based on CAS (compare-and-swap) operations is used to achieve concurrent writing by multiple producers.

[0041] The multi-producer single-consumer data bus is a 100Gbps Ethernet bus. After data aggregation, the multimodal high-frequency data streams are pushed to the corresponding input ports of the dynamic graph construction module through a splitting logic based on data type fields, such as order data marked as 01 and news data marked as 02, to ensure non-blocking transmission.
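The buffer sizing and the type-based splitting logic can be checked with a short sketch. It only illustrates the arithmetic and the routing step; it is single-threaded and does not implement the CAS-based lock-free queue. The slot header layout (a 1-byte type code followed by an 8-byte nanosecond timestamp) and the names `pack_record` and `route` are assumptions made for illustration.

```python
import struct

SLOT_BYTES = 128                 # fixed data block length from the text
PEAK_ORDERS_PER_S = 500_000      # peak order rate from the text
BURST_SECONDS = 10               # buffering horizon from the text
BUFFER_BYTES = 2 * 1024**3       # 2 GB buffer depth from the text

# Worst-case bytes needed to hold a 10-second burst of order records.
bytes_needed = PEAK_ORDERS_PER_S * BURST_SECONDS * SLOT_BYTES

TYPE_ORDER, TYPE_NEWS = 0x01, 0x02   # type marks 01 and 02 from the text
HEADER = struct.Struct("<BQ")        # assumed header: type code, timestamp (ns)


def pack_record(type_code, ts_ns, payload):
    """Pack one record into a zero-padded fixed-length slot."""
    body = HEADER.pack(type_code, ts_ns) + payload
    if len(body) > SLOT_BYTES:
        raise ValueError("record does not fit in one slot")
    return body.ljust(SLOT_BYTES, b"\x00")


def route(slot, ports):
    """Append the record to the input port that matches its type field."""
    type_code, ts_ns = HEADER.unpack_from(slot)
    ports[type_code].append((ts_ns, slot[HEADER.size:]))


ports = {TYPE_ORDER: [], TYPE_NEWS: []}
route(pack_record(TYPE_ORDER, 1_700_000_000_000_000_123, b"order"), ports)
route(pack_record(TYPE_NEWS, 1_700_000_000_000_000_456, b"news"), ports)
```

Under these figures a 10-second order burst needs 640,000,000 bytes, which fits within the 2 GB buffer with room for the other streams.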

[0042] S2. Dynamic graph construction: A dynamic financial causal relationship graph is constructed based on the multimodal high-frequency data stream; the graph includes fund entity nodes, fund heavy holdings entity nodes, industry sector index nodes, macroeconomic indicator nodes, and key event entity nodes extracted from news text; the nodes are connected by directed edges, which represent the causal influence relationship between the nodes, and the weight of the edges is dynamically updated to quantify the influence intensity.

[0043] In one feasible embodiment of the present invention, the fund entity node represents the target subject being predicted, is the central node of the dynamic financial causal relationship graph and the carrier of the prediction results, and is created with the fund's unique code as the core identifier.

[0044] The fund heavy holdings entity nodes represent the core components of the fund's investment portfolio. They are the most direct carriers of market risks and events, and are created using the stock's unique code as the core identifier.

[0045] The industry sector index nodes represent the industry sectors to which the fund's heavily held stocks belong. They serve as a mid-level bridge connecting individual stocks with the macroeconomy, used to capture industry-specific risks and opportunities. They are pre-created based on industry classification standards, such as SW801120.

[0046] The macroeconomic indicator nodes represent macroeconomic driving factors that influence the overall market environment and risk appetite. Their release is an important market event, and they are created in advance, such as CPI (Consumer Price Index) and PMI (Purchasing Managers Index).

[0047] The key event entity nodes represent immediate catalysts extracted from texts such as news and announcements that can influence specific entities. They are the core source of the graph's dynamism and are extracted from real-time news text data through named entity recognition and event extraction algorithms.

[0048] In one feasible embodiment of the present invention, the specific process of constructing the dynamic financial causal relationship graph includes: initializing the graph structure, establishing holding relationship edges between fund entity nodes and their heavily held stock entity nodes, and establishing subordinate relationship edges between the heavily held stock entity nodes and their respective industry sector index nodes.
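The initialization step can be sketched with a small in-memory graph. This is a hedged illustration only: the class `CausalGraph`, the relation labels, the edge directions, the use of portfolio share as the initial holding-edge weight, and the fund and stock codes are all assumptions; only the sector code SW801120 appears in the text.

```python
from collections import defaultdict


class CausalGraph:
    """Minimal directed graph with typed nodes and weighted, labeled edges."""

    def __init__(self):
        self.node_type = {}
        self.out_edges = defaultdict(dict)  # src -> {dst: (relation, weight)}

    def add_node(self, node_id, kind):
        self.node_type[node_id] = kind

    def add_edge(self, src, dst, relation, weight=1.0):
        if src not in self.node_type or dst not in self.node_type:
            raise KeyError("both endpoints must exist before adding an edge")
        self.out_edges[src][dst] = (relation, weight)


def init_graph(fund_code, holdings, sector_of):
    """Create fund, holding, and sector nodes plus the structural edges."""
    graph = CausalGraph()
    graph.add_node(fund_code, "fund")
    for stock, share in holdings.items():
        graph.add_node(stock, "holding")
        graph.add_edge(fund_code, stock, "holds", share)
        sector = sector_of[stock]
        if sector not in graph.node_type:
            graph.add_node(sector, "sector")
        graph.add_edge(stock, sector, "belongs_to")
    return graph


# Hypothetical fund with two heavily held stocks in the same sector.
graph = init_graph(
    "FUND001",
    {"STOCK_A": 0.08, "STOCK_B": 0.05},
    {"STOCK_A": "SW801120", "STOCK_B": "SW801120"},
)
```

Causal edges discovered later from the data streams would be added to the same structure alongside these fixed holding and subordinate edges.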

[0049] Within a preset rolling time window, an improved transfer entropy algorithm is used to calculate the directed information flow intensity between time series data corresponding to different nodes. When the information flow intensity exceeds a preset causal association threshold, a directed causal edge is established or updated between the corresponding node pairs, and the weight of the edge is assigned to the calculated information flow intensity value.

[0050] It should be noted that the improved transfer entropy algorithm introduces a sliding window adaptive bandwidth estimation to cope with the non-stationary characteristics of financial time series, and deletes the edge when the information flow intensity is lower than the causal association threshold and the edge has existed for more than a preset failure period.

[0051] In a specific example, the improved transfer entropy algorithm specifically includes:
(1) Time series preprocessing: Z-score normalization is applied to the input node time series, and outliers are then removed according to a [...] criterion;
(2) Adaptive bandwidth estimation of the sliding window: The bandwidth is calculated by kernel density estimation based on cross-validation. The sliding window size is set to 500 microseconds and is dynamically adjusted according to the volatility of the time series. For example, when the volatility is >0.5%, the window is reduced to 300 microseconds, and when the volatility is <0.1%, the window is expanded to 800 microseconds;
(3) Optimization of the transfer entropy calculation: The complexity is reduced by an approximate calculation method based on the fast Fourier transform. The transfer entropy value is mapped to the [0,1] interval through min-max normalization to obtain the information flow intensity;
(4) Parallelization of the algorithm: Parallel calculation of multiple node pairs is achieved through the CUDA cores of a GPU such as the NVIDIA A100. The calculation delay of a single node pair is controlled within 10 microseconds.
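To make the quantity being computed concrete, the sketch below estimates transfer entropy with a simple binned plug-in estimator and history length 1. This is a swapped-in textbook estimator, not the improved algorithm described above: it uses neither kernel density estimation nor the FFT approximation nor GPU parallelism. The function names, the bin count, and the synthetic test series are assumptions; only the window sizes and volatility cut-offs come from the text.

```python
import numpy as np


def adaptive_window_us(volatility_pct):
    """Sliding window size in microseconds for a volatility given in percent."""
    if volatility_pct > 0.5:
        return 300
    if volatility_pct < 0.1:
        return 800
    return 500


def transfer_entropy(x, y, bins=4):
    """Plug-in estimate of T(X -> Y) in nats, using one step of history."""

    def discretize(series):
        # Equal-frequency bins: interior quantiles serve as bin edges.
        edges = np.quantile(series, np.linspace(0, 1, bins + 1)[1:-1])
        return np.searchsorted(edges, series, side="right")

    xd = discretize(np.asarray(x, dtype=float))
    yd = discretize(np.asarray(y, dtype=float))
    y_next, y_now, x_now = yd[1:], yd[:-1], xd[:-1]

    # Joint distribution p(y_next, y_now, x_now) from co-occurrence counts.
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (y_next, y_now, x_now), 1.0)
    joint /= joint.sum()

    p_now = joint.sum(axis=0)        # p(y_now, x_now)
    p_pair = joint.sum(axis=2)       # p(y_next, y_now)
    p_y = joint.sum(axis=(0, 2))     # p(y_now)

    te = 0.0
    for a, b, c in zip(*np.nonzero(joint)):
        p = joint[a, b, c]
        te += p * np.log(p * p_y[b] / (p_now[b, c] * p_pair[a, b]))
    return te


# Synthetic check: y follows x with a one-step lag, so information flows x -> y.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = np.empty_like(x)
y[0] = 0.0
y[1:] = x[:-1] + 0.1 * rng.normal(size=x.size - 1)
te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
```

In this synthetic case the estimate in the x to y direction is clearly larger than in the reverse direction, which is the asymmetry the directed causal edges rely on. Mapping the raw values into [0,1] by min-max normalization across node pairs would be a separate step.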

[0052] Here, the time series of a price node is its microsecond-level price fluctuation sequence, the time series of an event node is its occurrence frequency sequence, and the graph topology and edge weights are updated iteratively at a millisecond-level frequency.

[0053] For example, the preset rolling time window specifically includes: price-related nodes set to 500 milliseconds, event-related nodes set to 1 second, determined based on backtesting of high-frequency historical data.

[0054] The preset causal correlation threshold can be set to 0.25, which is determined by the historical data calibration method.

[0055] The preset failure period specifically includes: price-related nodes are set to 100 milliseconds, and event-related nodes are set to 300 milliseconds.
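The edge maintenance rule (create or update above the threshold, delete when below the threshold and older than the failure period) can be sketched as follows. The names `Edge` and `update_edge` are hypothetical, and two details are assumptions because the text does not specify them: an edge's age is measured from its creation time, and an edge that is below the threshold but not yet expired keeps its last weight.

```python
from dataclasses import dataclass

CAUSAL_THRESHOLD = 0.25                        # preset causal association threshold
EXPIRY_MS = {"price": 100.0, "event": 300.0}   # preset failure periods


@dataclass
class Edge:
    weight: float
    created_ms: float


def update_edge(edges, pair, intensity, now_ms, kind):
    """Apply one information flow measurement to the edge for `pair`."""
    edge = edges.get(pair)
    if intensity > CAUSAL_THRESHOLD:
        if edge is None:
            edges[pair] = Edge(weight=intensity, created_ms=now_ms)
        else:
            edge.weight = intensity
    elif (
        edge is not None
        and intensity < CAUSAL_THRESHOLD
        and now_ms - edge.created_ms > EXPIRY_MS[kind]
    ):
        del edges[pair]


edges = {}
pair = ("STOCK_A", "FUND001")                      # hypothetical node pair
update_edge(edges, pair, 0.40, now_ms=0.0, kind="price")
weight_after_create = edges[pair].weight
update_edge(edges, pair, 0.10, now_ms=50.0, kind="price")
kept_while_young = pair in edges
update_edge(edges, pair, 0.10, now_ms=150.0, kind="price")
removed_after_expiry = pair not in edges
```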

[0056] This invention elevates the fund return prediction problem from the traditional time series analysis paradigm to the network structure analysis paradigm by constructing a dynamic financial causal relationship graph. It can capture return fluctuations caused by changes in market microstructure and information transmission that are imperceptible to traditional models, significantly improving the depth and breadth of prediction.

[0057] S3. Temporal graph encoding: Temporal graph convolutional encoding is performed on the dynamic financial causal relationship graph to generate dynamic state embedding vectors of target fund entity nodes through stacked temporal graph convolutional layers.

[0058] In one feasible embodiment of the present invention, the specific generation process of the dynamic state embedding vector of the target fund entity node includes: in each temporal graph convolutional layer, performing neighbor information aggregation on any node, and weighting and aggregating the state embedding vectors of all its neighbor nodes at the previous time step through an aggregation function with attention mechanism, wherein the attention weight is jointly determined by the node's own features, the features of neighbor nodes, and the weights of the connecting edges.

[0059] The aggregated neighbor information is concatenated with the node's own state embedding vector from the previous time step and then input into a gated recurrent unit for nonlinear transformation to generate the node's state embedding vector for the current time step.

[0060] After multi-layer temporal graph convolution propagation, the target fund entity node generates a dynamic state embedding vector that integrates multi-level neighbor information and high-frequency dynamic evolution patterns from the entire graph.

[0061] Specifically, the attention weight calculation in the neighbor information aggregation includes: (1) Neighbor information aggregation: for any node $v$ in the graph at any time $t$, the result of its neighbor information aggregation $m_v^{t}$ is calculated using an aggregation function with an attention mechanism:

$$m_v^{t} = \sum_{u \in \mathcal{N}(v)} \alpha_{vu}^{t}\, h_u^{t-1},$$

where $\mathcal{N}(v)$ represents the set of all first-order incoming-edge neighbor nodes of node $v$, $h_u^{t-1}$ is the state embedding vector of neighbor node $u$ at the previous time $t-1$, and $\alpha_{vu}^{t}$ is the attention weight.

[0062] It should be noted that the specific formula for calculating the attention weight is as follows:

$$\alpha_{vu}^{t} = \operatorname{softmax}_{u \in \mathcal{N}(v)}\!\left(\operatorname{LeakyReLU}\!\left(a^{\top}\left[\,W h_v^{t-1} \,\|\, W h_u^{t-1} \,\|\, w_{uv}^{t}\,\right]\right)\right),$$

where $W$ is a learnable weight matrix, $a$ is a learnable attention vector, $\|$ represents the vector concatenation operation, $w_{uv}^{t}$ is the current weight of the directed edge from node $u$ to node $v$, LeakyReLU is the activation function, and softmax normalizes the attention weights of all neighbors so that they sum to one. This design ensures that the aggregation process not only considers the features of the node itself and the features of its neighbors, but also explicitly incorporates the causal strength information of the edges.

[0063] (2) Node state update: the aggregated neighbor information $m_v^{t}$ is concatenated with node $v$'s own state embedding vector at the previous time step, $h_v^{t-1}$, to obtain the combined vector $[\,m_v^{t} \,\|\, h_v^{t-1}\,]$. The combined vector is then input into a gated recurrent unit.

[0064] The gated recurrent unit controls the information flow through its update and reset gates, performs nonlinear transformations, and finally outputs the updated state embedding vector $h_v^{t}$ of node $v$ at the current time $t$. After four layers of temporal graph convolution propagation, information spreads from the first-order neighbors of the target fund entity node to the fourth-order neighbors, covering multi-order interaction relationships across the entire graph.

[0065] Ultimately, the output of the target fund entity node at the last layer is its dynamic state embedding vector, which has 512 dimensions and integrates the fund itself, its holdings, its industry, the macro environment, and market sentiment into a high-frequency dynamic evolution pattern.
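One temporal graph convolution step for a single node (attention-weighted aggregation over incoming neighbors, then a gated recurrent unit update) can be sketched in NumPy. This is an illustration under stated assumptions rather than the patented network: the exact way the edge weight enters the attention score, the LeakyReLU slope of 0.2, the bias-free GRU, the 8-dimensional toy embeddings, and all variable names are assumptions.

```python
import numpy as np


def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def attention_aggregate(h_self, h_nbrs, edge_w, W, a):
    """Weight each neighbor by node features, neighbor features, and edge weight."""
    z_self = W @ h_self                       # projected own features, shape (p,)
    z_nbrs = h_nbrs @ W.T                     # projected neighbors, shape (k, p)
    k = h_nbrs.shape[0]
    feats = np.hstack([np.tile(z_self, (k, 1)), z_nbrs, edge_w[:, None]])
    scores = leaky_relu(feats @ a)
    scores = scores - scores.max()            # numerically stable softmax
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ h_nbrs, alpha


def gru_update(x, h_prev, params):
    """Bias-free GRU cell: x is the input, h_prev the previous state."""
    z = sigmoid(params["Wz"] @ x + params["Uz"] @ h_prev)   # update gate
    r = sigmoid(params["Wr"] @ x + params["Ur"] @ h_prev)   # reset gate
    cand = np.tanh(params["Wh"] @ x + params["Uh"] @ (r * h_prev))
    return (1.0 - z) * h_prev + z * cand


rng = np.random.default_rng(0)
d, k = 8, 3                                   # toy embedding size, neighbor count
h_self = rng.normal(size=d)
h_nbrs = rng.normal(size=(k, d))
edge_w = np.array([0.3, 0.6, 0.9])            # causal edge weights in [0, 1]
W = rng.normal(size=(d, d)) * 0.1
a = rng.normal(size=2 * d + 1)
params = {
    name: rng.normal(size=(d, 2 * d if name.startswith("W") else d)) * 0.1
    for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
}

m, alpha = attention_aggregate(h_self, h_nbrs, edge_w, W, a)
h_new = gru_update(np.concatenate([m, h_self]), h_self, params)
```

Stacking four such layers, as described above, lets information from fourth-order neighbors reach the fund node.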

[0066] This invention employs a temporal graph convolutional network to encode dynamic graphs, enabling it to adaptively learn complex, nonlinear, high-frequency interaction patterns among financial entities, thereby enhancing the robustness and adaptability to market dynamics.

[0067] S4. Fund return prediction: The dynamic state embedding vector is input into a cross-scale attention decoder, which outputs the predicted fund return value and prediction confidence for a specific future time span, such as a time window of 100 milliseconds to 500 milliseconds.

[0068] In one feasible embodiment of the present invention, the cross-scale attention decoder comprises multiple parallel self-attention heads.

[0069] The multiple parallel self-attention heads extract features at corresponding time scales, specifically including ultra-short-term attention heads, short-term attention heads, and medium-term attention heads.

[0070] It should be noted that the ultra-short-term attention head is configured to focus only on local time windows to extract ultra-short-term time scale features; the short-term attention head is configured to focus on medium-term time windows to extract short-term time scale features; and the medium-term attention head is configured to focus on global time windows to extract medium-term time scale features.

[0071] In one specific example, the ultra-short-term attention head focuses on a local time window of 50-100 milliseconds, covering the most recent 20-30 historical time steps of the dynamic state embedding vector, which corresponds to the need to capture instantaneous fluctuations in high-frequency trading.

[0072] The short-term attention head focuses on a medium time window of 100-300 milliseconds, covering the most recent 50-80 historical time steps, which corresponds to the identification of short-term trends.

[0073] The medium-term attention head focuses on a global time window of 300-1000 milliseconds, covering the most recent 100-200 historical time steps, which corresponds to the medium-term market environment impact assessment.

[0074] It should also be noted that the starting time of each window mentioned above is based on the current prediction time, and the historical sequence range is determined by a backward sliding method.
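The three windowed heads can be sketched as scaled dot-product attention in which each head only sees the most recent steps of the embedding history. This is a minimal sketch under assumptions: the window lengths take the upper end of each stated range, the query is taken from the latest time step, the head width of 64 follows the 64×3 figure given later in the text, and the weights are random placeholders rather than trained parameters.

```python
import numpy as np

EMBED_DIM = 512     # dynamic state embedding size from the text
HEAD_DIM = 64       # per-head width implied by the 64x3=192 concatenation
WINDOWS = {"ultra_short": 30, "short": 80, "medium": 200}  # steps, assumed


def windowed_head(history, Wq, Wk, Wv, steps):
    """Attend from the latest step over the last `steps` rows of history."""
    window = history[-steps:]                 # backward-sliding window
    q = history[-1] @ Wq                      # query from the prediction time
    keys = window @ Wk
    values = window @ Wv
    scores = keys @ q / np.sqrt(HEAD_DIM)
    scores = scores - scores.max()            # numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ values


rng = np.random.default_rng(0)
history = rng.normal(size=(200, EMBED_DIM))   # placeholder embedding history
head_outputs = {}
for name, steps in WINDOWS.items():
    Wq, Wk, Wv = (
        rng.normal(size=(EMBED_DIM, HEAD_DIM)) / np.sqrt(EMBED_DIM)
        for _ in range(3)
    )
    head_outputs[name] = windowed_head(history, Wq, Wk, Wv, steps)

concat = np.concatenate([head_outputs[n] for n in ("ultra_short", "short", "medium")])
```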

[0075] It should be added that the output coordination mechanism of the above three parallel attention heads is as follows:
(1) Output fusion method: attention-weighted fusion is adopted. First, the contribution weight of each head's features is calculated; for example, the initial weight of the ultra-short-term head is 0.4, the short-term head 0.3, and the medium-term head 0.3. The weights are then dynamically adjusted according to the current market volatility. For example, when the volatility is >0.05%, the weight of the ultra-short-term head is increased to 0.5 and the weight of the medium-term head is reduced to 0.2; when the volatility is <0.02%, the weight of the medium-term head is increased to 0.4 and the weight of the ultra-short-term head is reduced to 0.3. Finally, the fused feature is obtained by weighted summation;
(2) Conflict handling rules: when the features of different heads conflict in the prediction direction, such as the ultra-short-term head predicting a return of +0.06% and the medium-term head predicting -0.04%, the direction of the head with the lower feature information entropy is selected as the basis for judgment, and the prediction value is adjusted in combination with the features of the other heads. For example, if the information entropy of the ultra-short-term head is 0.2 and that of the medium-term head is 0.5, the prediction value is adjusted to +0.03% based on the ultra-short-term direction.
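The weight schedule and the entropy-based direction rule can be written down directly from the figures in this paragraph. The sketch is illustrative only: the function names are hypothetical, and it deliberately stops at choosing the direction, because the text does not give the formula that turns the example into the adjusted value of +0.03%.

```python
import numpy as np


def head_weights(volatility_pct):
    """Fusion weights per head for a market volatility given in percent."""
    if volatility_pct > 0.05:
        return {"ultra_short": 0.5, "short": 0.3, "medium": 0.2}
    if volatility_pct < 0.02:
        return {"ultra_short": 0.3, "short": 0.3, "medium": 0.4}
    return {"ultra_short": 0.4, "short": 0.3, "medium": 0.3}


def fuse(features, weights):
    """Weighted sum of the per-head feature vectors."""
    return sum(weights[name] * np.asarray(vec) for name, vec in features.items())


def resolve_direction(predictions, entropies):
    """Trust the head with the lowest information entropy; return (head, sign)."""
    trusted = min(entropies, key=entropies.get)
    return trusted, (1 if predictions[trusted] > 0 else -1)


weights = head_weights(0.06)
fused = fuse({name: np.ones(4) for name in weights}, weights)
trusted_head, direction = resolve_direction(
    {"ultra_short": 0.06, "medium": -0.04},   # predicted returns in percent
    {"ultra_short": 0.2, "medium": 0.5},      # feature information entropy
)
```

In all three volatility regimes the weights sum to one, so the fused feature stays on the same scale as the individual head features.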

[0076] It should be explained that the cross-scale attention decoder aims to analyze the historical sequence of the target fund's dynamic state embedding vector, capture its dependency patterns at different time scales, and jointly output the point prediction, i.e., the fund return prediction value, and the uncertainty estimate, i.e., the prediction confidence.

[0077] In one feasible embodiment of the present invention, the specific process of obtaining the fund return prediction value and prediction confidence includes: importing the dynamic state embedding vector as input into a cross-scale attention decoder.

[0078] The multiple parallel self-attention heads calculate the query vector, key vector, and value vector corresponding to the dynamic state embedding vector within their respective scale ranges.

[0079] The query vector, key vector, and value vector corresponding to the dynamic state embedding vector output by each attention head are concatenated to obtain the context vector of the dynamic state embedding vector.

[0080] Specifically, the query vector, key vector, and value vector corresponding to the dynamic state embedding vectors output by the ultra-short-term, short-term, and medium-term attention heads are concatenated in the order ultra-short-term, short-term, medium-term, yielding a concatenated vector of 64×3=192 dimensions.

[0081] Further, a linear projection layer with a 192×512 weight matrix, initialized with He normal initialization, projects the concatenated vector to 512 dimensions. A LayerNorm layer then normalizes the vector along the feature dimension, finally yielding a 512-dimensional context vector.
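The 192-to-512 projection followed by layer normalization can be sketched in a few lines of NumPy. This is a minimal sketch: the zero bias, the omission of LayerNorm's learnable gain and shift, the epsilon value, and the random placeholder head outputs are assumptions; the dimensions and the He normal initialization (standard deviation sqrt(2 / fan_in)) follow the text.

```python
import numpy as np

IN_DIM, OUT_DIM = 192, 512

rng = np.random.default_rng(0)
# He normal initialization: zero mean, standard deviation sqrt(2 / fan_in).
W_proj = rng.normal(0.0, np.sqrt(2.0 / IN_DIM), size=(IN_DIM, OUT_DIM))
b_proj = np.zeros(OUT_DIM)


def layer_norm(x, eps=1e-5):
    """Normalize along the feature axis (no learnable gain or shift here)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def context_vector(head_outputs):
    """Concatenate three 64-dim head outputs, project to 512, then normalize."""
    concat = np.concatenate(head_outputs)
    if concat.shape != (IN_DIM,):
        raise ValueError("expected three 64-dimensional head outputs")
    return layer_norm(concat @ W_proj + b_proj)


ctx = context_vector([rng.normal(size=64) for _ in range(3)])
```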

[0082] The context vector is processed through a fully connected network to output the predicted fund return and prediction confidence level over a specific future time span.

[0083] It should be noted that the fully connected network includes two output heads, namely the return prediction head and the confidence prediction head.

[0084] In a specific example, the detailed configuration of the fully connected network is as follows: (1) a return prediction head, which includes one hidden layer (256 dimensions) with a LeakyReLU activation function (negative slope 0.01) and one output layer (1 dimension) with a linear activation function. The hidden-layer weights are initialized with He normal initialization, and the output-layer weights with random normal initialization.

[0085] (2) a confidence prediction head, which shares the hidden layer with the return prediction head and has only its own independent 1-dimensional output layer with a Sigmoid activation function. The output-layer weights are initialized with random normal initialization.
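The two heads on a shared hidden layer can be sketched as a small forward pass. The sketch uses untrained random weights, so its outputs carry no predictive meaning; the standard deviation of 0.01 for the random normal initialization, the zero biases, and the function name `predict` are assumptions, while the layer sizes and activations follow the text.

```python
import numpy as np

CTX_DIM, HIDDEN_DIM = 512, 256

rng = np.random.default_rng(0)
# Shared hidden layer, He normal initialization.
W_hidden = rng.normal(0.0, np.sqrt(2.0 / CTX_DIM), size=(CTX_DIM, HIDDEN_DIM))
b_hidden = np.zeros(HIDDEN_DIM)
# Separate 1-dimensional output layers, random normal initialization.
w_return = rng.normal(0.0, 0.01, size=HIDDEN_DIM)
w_conf = rng.normal(0.0, 0.01, size=HIDDEN_DIM)


def predict(ctx):
    """Return (predicted_return, confidence) for one 512-dim context vector."""
    pre = ctx @ W_hidden + b_hidden
    hidden = np.where(pre > 0, pre, 0.01 * pre)     # LeakyReLU, slope 0.01
    predicted_return = float(hidden @ w_return)     # linear output
    confidence = float(1.0 / (1.0 + np.exp(-(hidden @ w_conf))))  # Sigmoid
    return predicted_return, confidence


predicted_return, confidence = predict(rng.normal(size=CTX_DIM))
```

Because the confidence output passes through a Sigmoid, it always lies strictly between 0 and 1.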

[0086] It should also be explained that the predicted fund return and the prediction confidence are obtained indirectly from the fusion of context vectors, rather than directly from the concatenation of query vector, key vector and value vector. The query vector, key vector and value vector are only used to generate the context vector and do not participate in the final prediction themselves.

[0087] S5. Transaction instruction generation: Generate high-frequency trading instructions based on the fund return forecast and forecast confidence level, and send them to the exchange through the trading interface.

[0088] In one feasible embodiment of the present invention, the high-frequency trading instruction is a structured digital message, including at least a trading account, fund target code, buy/sell direction, order type, order quantity, and order price; wherein the buy/sell direction is determined by the sign of the predicted return, and the order quantity is proportional to the product of the absolute value of the predicted return and the prediction confidence level.
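As an illustration of such a structured message, the six minimum fields listed in [0088] could be represented as below. The class name, field names, types, and all example values are hypothetical; the source does not define a message schema or wire format.

```python
from dataclasses import dataclass, asdict
from enum import Enum

class Side(Enum):
    BUY = "BUY"
    SELL = "SELL"

@dataclass(frozen=True)
class TradingInstruction:
    """Minimum field set named in [0088]; names and types are assumptions."""
    account: str      # trading account
    fund_code: str    # fund target code
    side: Side        # buy/sell direction
    order_type: str   # e.g. "LIMIT" (placeholder value)
    quantity: float   # order quantity
    price: float      # order price

# Placeholder values only, not taken from the source.
example = TradingInstruction(
    account="ACC-001",
    fund_code="000000",
    side=Side.SELL,
    order_type="LIMIT",
    quantity=100.0,
    price=1.234,
)
message_fields = asdict(example)
```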

[0089] For example, order quantity = base unit × (absolute value of predicted return × prediction confidence level) ÷ 0.0005.
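The sizing rule in [0089] translates directly into a small function. The function and parameter names are hypothetical, and the example inputs are illustrative. The source does not say whether the result is rounded to a tradable lot size, so no rounding is applied here.

```python
def order_quantity(base_unit, predicted_return, confidence, scale=0.0005):
    """Order quantity = base unit x (|predicted return| x confidence) / 0.0005."""
    return base_unit * (abs(predicted_return) * confidence) / scale

# Illustrative inputs: base unit 100, predicted return 0.1%, confidence 0.9.
qty = order_quantity(100, 0.001, 0.9)  # approximately 180
```

Under this rule, a predicted return whose product with the confidence equals 0.0005 yields exactly one base unit, and the quantity scales linearly from there.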

[0090] The specific process of generating high-frequency trading instructions based on the predicted fund return and the prediction confidence level includes: generating a buy or sell instruction corresponding to the direction of the return only when the absolute value of the predicted return is greater than a preset return threshold and the prediction confidence level is greater than a preset confidence level threshold. Specifically, if the predicted return is positive, a buy instruction is generated; if the predicted return is negative, a sell instruction is generated.
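The gating logic of [0090] can be sketched as a function that returns a direction only when both thresholds are exceeded. The function name, the string labels, and the numeric values in the example calls are illustrative assumptions. Both comparisons are strict ("greater than"), as in the text.

```python
from typing import Optional

def trade_direction(predicted_return: float, confidence: float,
                    return_threshold: float,
                    confidence_threshold: float) -> Optional[str]:
    """Return "BUY" or "SELL" only if both thresholds are strictly exceeded."""
    if abs(predicted_return) <= return_threshold:
        return None  # predicted move too small to act on
    if confidence <= confidence_threshold:
        return None  # prediction not trusted enough
    return "BUY" if predicted_return > 0 else "SELL"

# Illustrative calls with assumed threshold values.
signal_buy = trade_direction(0.0009, 0.95, return_threshold=0.0005, confidence_threshold=0.8)
signal_sell = trade_direction(-0.0009, 0.95, return_threshold=0.0005, confidence_threshold=0.8)
signal_none = trade_direction(0.0009, 0.60, return_threshold=0.0005, confidence_threshold=0.8)
```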

[0091] In one specific example, the preset return threshold is 0.05% and the preset confidence threshold is 0.8; both thresholds can be dynamically adjusted according to market volatility.

[0092] For example, when market volatility is greater than 1%, the preset return threshold is increased to 0.08%; when market volatility is less than 0.3%, the preset return threshold is decreased to 0.03%.
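The volatility-dependent adjustment in [0091] and [0092] can be written as a simple step function. The function name is hypothetical, all quantities are expressed as fractions (0.05% = 0.0005), and the behavior exactly at the 1% and 0.3% boundaries is an assumption, since the source only covers the strict inequalities.

```python
def adjusted_return_threshold(volatility: float) -> float:
    """Map market volatility to the preset return threshold, both as fractions."""
    if volatility > 0.01:    # volatility above 1%: raise the threshold to 0.08%
        return 0.0008
    if volatility < 0.003:   # volatility below 0.3%: lower the threshold to 0.03%
        return 0.0003
    return 0.0005            # otherwise keep the default threshold of 0.05%

high_vol = adjusted_return_threshold(0.02)
calm = adjusted_return_threshold(0.001)
normal = adjusted_return_threshold(0.005)
```

Raising the threshold in volatile markets filters out predicted moves that are small relative to the prevailing noise, while lowering it in calm markets allows smaller predicted moves to trigger instructions.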

[0093] By outputting both a predicted return and a confidence assessment of that prediction through this specific network structure, the invention provides a quantitative basis for risk control and position management, enabling high-frequency trading strategies to conduct refined risk management while pursuing high returns.

[0094] Example 2

[0095] As shown in Figure 2, the present invention provides a fund return prediction system that supports high-frequency trading, comprising the following modules: a data stream acquisition module, a dynamic graph construction module, a time series graph encoding module, a fund return prediction module, and a trading instruction generation module.

[0096] The data stream acquisition module is connected to the dynamic graph construction module, the dynamic graph construction module is connected to the time series graph encoding module, the time series graph encoding module is connected to the fund return prediction module, and the fund return prediction module is connected to the transaction instruction generation module.

[0097] The data stream acquisition module acquires and processes multimodal high-frequency data streams in real time, including exchange push data streams, financial information data streams, and real-time economic data streams.

[0098] The dynamic graph construction module constructs a dynamic financial causal relationship graph based on the multimodal high-frequency data stream. The graph includes fund entity nodes, fund top holdings entity nodes, industry sector index nodes, macroeconomic indicator nodes, and key event entity nodes extracted from news text. The nodes are connected by directed edges, which represent the causal influence relationship between the nodes, and the weights of the edges are dynamically updated to quantify the influence intensity.

[0099] The temporal graph encoding module performs temporal graph convolutional encoding on the dynamic financial causal relationship graph, and generates dynamic state embedding vectors of target fund entity nodes through stacked temporal graph convolutional layers.

[0100] The fund return prediction module inputs the dynamic state embedding vector into a cross-scale attention decoder, and outputs the predicted fund return value and prediction confidence level for a specific future time span.

[0101] The trading instruction generation module generates high-frequency trading instructions based on the fund return forecast and the forecast confidence level, and sends them to the exchange through the trading interface.
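The module chain of [0096] to [0101] can be sketched as a linear pipeline in which each module consumes the previous module's output. The class name, attribute names, and the stub functions standing in for the real modules are hypothetical placeholders; they only illustrate the connection order, not the modules' internal logic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

@dataclass
class PredictionSystem:
    """Five modules wired in the order described in [0096]."""
    acquire: Callable[[], Any]                      # data stream acquisition
    build_graph: Callable[[Any], Any]               # dynamic graph construction
    encode: Callable[[Any], Any]                    # time series graph encoding
    predict: Callable[[Any], Tuple[float, float]]   # fund return prediction
    generate_instruction: Callable[[float, float], Any]  # instruction generation

    def step(self) -> Any:
        stream = self.acquire()
        graph = self.build_graph(stream)
        embedding = self.encode(graph)
        predicted_return, confidence = self.predict(embedding)
        return self.generate_instruction(predicted_return, confidence)

# Stub modules with made-up behavior, used only to exercise the wiring.
system = PredictionSystem(
    acquire=lambda: {"ticks": 3},
    build_graph=lambda stream: {"nodes": stream["ticks"]},
    encode=lambda graph: [float(graph["nodes"])] * 4,
    predict=lambda embedding: (embedding[0] * 0.0001, 0.9),
    generate_instruction=lambda r, c: ("BUY" if r > 0 else "SELL", c),
)
result = system.step()
```

Because each stage depends only on its predecessor's output, the stages can be integrated into one processing module or kept physically separate without changing the data flow.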

[0102] The above embodiments can be implemented, in whole or in part, by software, hardware, firmware, or any other combination thereof. When implemented using software, the above embodiments can be implemented, in whole or in part, in the form of a computer program product.

[0103] Those skilled in the art will recognize that the algorithmic steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.

[0104] In addition, the functional modules in the various embodiments of this application can be integrated into one processing module, or each module can exist physically separately, or two or more modules can be integrated into one module.

[0105] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

[0106] Finally, the above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for predicting fund returns that supports high-frequency trading, characterized in that it comprises: S1. Data stream acquisition: acquiring and processing multimodal high-frequency data streams in real time, including exchange push data streams, financial information data streams, and real-time economic data streams; S2. Dynamic graph construction: constructing a dynamic financial causal relationship graph based on the multimodal high-frequency data streams; the graph includes fund entity nodes, fund top holdings entity nodes, industry sector index nodes, macroeconomic indicator nodes, and key event entity nodes extracted from news text; nodes are connected by directed edges, which represent the causal relationship between nodes, and the weights of the edges are dynamically updated to quantify the intensity of the influence; S3. Temporal graph encoding: performing temporal graph convolutional encoding on the dynamic financial causal relationship graph to generate dynamic state embedding vectors of target fund entity nodes through stacked temporal graph convolutional layers; S4. Fund return prediction: inputting the dynamic state embedding vector into a cross-scale attention decoder, which outputs the predicted fund return value and prediction confidence level within a specific future time span; S5. Transaction instruction generation: generating high-frequency trading instructions based on the fund return forecast and forecast confidence level, and sending them to the exchange through the trading interface.

2. The fund return prediction method supporting high-frequency trading according to claim 1, characterized in that: the exchange push data stream includes order data, transaction data, and buy/sell order book depth snapshot data; the financial information data stream includes real-time news text data and social media sentiment index data; the real-time economic data stream includes published indicators and key parameters.

3. The fund return prediction method supporting high-frequency trading according to claim 2, characterized in that: the specific process for processing multimodal high-frequency data streams includes: appending to each accessed data record a timestamp generated by a hardware time synchronization device, which is kept synchronized with the standard time source of the National Time Service Center through the Precision Time Protocol (PTP); and feeding the timestamped multimodal high-frequency data streams into a multi-producer, single-consumer data bus based on a lock-free circular buffer to achieve non-blocking data aggregation.

4. The fund return prediction method supporting high-frequency trading according to claim 1, characterized in that: the fund entity node represents the target entity being predicted; it is the central node of the dynamic financial causal relationship graph and the carrier of the prediction results, and is created with the fund's unique code as the core identifier; the fund top holdings entity nodes represent the core components of the fund's investment portfolio; they are the most direct carriers of market risks and market events, and are created using the stock's unique code as the core identifier; the industry sector index nodes represent the industry sectors to which the fund's top holdings belong; they serve as a mid-level bridge connecting individual stocks with the macroeconomy, are used to capture industry-specific risks and opportunities, and are pre-created based on industry classification standards; the macroeconomic indicator nodes represent macroeconomic driving factors that influence the overall market environment and risk appetite; the release of these indicators is an important market event, and the nodes are pre-created; the key event entity nodes represent immediate catalysts, extracted from texts such as news and announcements, that can influence specific entities; they are the core source of the graph's dynamics, and are extracted from real-time news text data through named entity recognition and event extraction algorithms.

5. The fund return prediction method supporting high-frequency trading according to claim 4, characterized in that: the specific process of constructing the dynamic financial causal relationship graph includes: initializing the graph structure by establishing holding relationship edges between fund entity nodes and their top holdings entity nodes, and establishing subordinate relationship edges between top holdings entity nodes and their respective industry sector index nodes; within a preset rolling time window, using an improved transfer entropy algorithm to calculate the directed information flow intensity between the time series data corresponding to different nodes; when the information flow intensity exceeds a preset causal association threshold, establishing or updating a directed causal edge between the corresponding node pair, and setting the edge weight to the calculated information flow intensity value; wherein the time series of a price node is its microsecond-level price fluctuation sequence, the time series of an event node is its occurrence frequency sequence, and the graph topology and edge weights are updated iteratively at a millisecond-level frequency.

6. The fund return prediction method supporting high-frequency trading according to claim 1, characterized in that: The specific generation process of the dynamic state embedding vector of the target fund entity node includes: In each temporal graph convolutional layer, neighbor information aggregation is performed on any node. The state embedding vectors of all its neighboring nodes in the previous time step are weighted and aggregated through an aggregation function with attention mechanism. The attention weights are jointly determined by the node's own features, the features of its neighboring nodes, and the weights of the connecting edges. The aggregated neighbor information is concatenated with the node's own state embedding vector from the previous time step and then input into the gated recurrent unit for nonlinear transformation to generate the node's state embedding vector at the current time step. After multi-layer temporal graph convolution propagation, the target fund entity node generates a dynamic state embedding vector that integrates multi-level neighbor information and high-frequency dynamic evolution patterns from the entire graph.

7. The fund return prediction method supporting high-frequency trading according to claim 1, characterized in that: The cross-scale attention decoder contains multiple parallel self-attention heads; The multiple parallel self-attention heads extract features at corresponding time scales, specifically including ultra-short-term attention heads, short-term attention heads, and medium-term attention heads.

8. The fund return prediction method supporting high-frequency trading according to claim 7, characterized in that: The specific process for obtaining the fund return forecast and the forecast confidence level includes: The dynamic state embedding vector is imported as input into the cross-scale attention decoder; The multiple parallel self-attention heads calculate the query vector, key vector, and value vector corresponding to the dynamic state embedding vector within their respective scale ranges. The query vector, key vector, and value vector corresponding to the dynamic state embedding vector output by each attention head are concatenated to obtain the context vector of the dynamic state embedding vector. The context vector is processed through a fully connected network to output the predicted fund return and prediction confidence level over a specific future time span.

9. The fund return prediction method supporting high-frequency trading according to claim 1, characterized in that: the high-frequency trading instruction is a structured digital message, which includes at least the trading account, fund code, buy/sell direction, order type, order quantity, and order price; wherein the buy/sell direction is determined by the sign of the predicted return, and the order quantity is proportional to the product of the absolute value of the predicted return and the prediction confidence level; the specific process of generating high-frequency trading instructions based on the predicted fund return and the prediction confidence level includes: generating a buy or sell instruction corresponding to the direction of the return only when the absolute value of the predicted return is greater than a preset return threshold and the prediction confidence level is greater than a preset confidence level threshold; specifically, if the predicted return is positive, a buy instruction is generated; if the predicted return is negative, a sell instruction is generated.

10. A system for executing the fund return prediction method supporting high-frequency trading as described in any one of claims 1-9, characterized in that it comprises: a data stream acquisition module, which acquires and processes multimodal high-frequency data streams in real time, including exchange push data streams, financial information data streams, and real-time economic data streams; a dynamic graph construction module, which constructs a dynamic financial causal relationship graph based on the multimodal high-frequency data streams; the graph includes fund entity nodes, fund top holdings entity nodes, industry sector index nodes, macroeconomic indicator nodes, and key event entity nodes extracted from news text; nodes are connected by directed edges, which represent the causal relationship between nodes, and the weights of the edges are dynamically updated to quantify the intensity of the influence; a temporal graph encoding module, which performs temporal graph convolutional encoding on the dynamic financial causal relationship graph and generates dynamic state embedding vectors of target fund entity nodes through stacked temporal graph convolutional layers; a fund return prediction module, which inputs the dynamic state embedding vector into a cross-scale attention decoder and outputs the predicted fund return value and prediction confidence level within a specific future time span; and a trading instruction generation module, which generates high-frequency trading instructions based on the fund return forecast and the forecast confidence level, and sends them to the exchange through the trading interface.