A method and system for optimizing freight rate transaction strategies based on machine learning

By combining multi-dimensional data processing and hybrid machine learning models with reinforcement learning, freight rate trading strategies are optimized, solving the problems of insufficient data processing and sluggish strategy response in existing technologies, and achieving efficient freight rate trading strategy optimization and risk management.

CN122198822A · Pending · Publication Date: 2026-06-12 · GUANGZHOU GUANGHANG FINANCIAL SERVICES TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: GUANGZHOU GUANGHANG FINANCIAL SERVICES TECHNOLOGY CO LTD
Filing Date: 2026-03-04
Publication Date: 2026-06-12

Abstract

The application discloses a machine learning-based method and system for optimizing freight rate trading strategies, and relates to the field of machine learning. The method comprises the following steps: constructing a standardized database through data cleaning and feature engineering, and predicting the freight rate trend with an LSTM-XGBoost hybrid model; establishing a multi-objective optimization model to generate an initial trading strategy and, after testing and verification in a simulation environment, dynamically tuning the strategy parameters by reinforcement learning to balance returns and risks; and deploying a real-time monitoring system to achieve closed-loop strategy optimization. Through accurate preprocessing of multi-dimensional data, LSTM and XGBoost hybrid model prediction, genetic algorithm strategy generation, and reinforcement learning dynamic parameter tuning, combined with a closed-loop iteration mechanism, the application can adapt to freight rate market changes, balance return improvement with risk control, and significantly improve the rigor and stability of trading decisions.

Description

Technical Field

[0001] This invention relates to the field of machine learning, and in particular to a method and system for optimizing freight rate trading strategies based on machine learning.

Background Technology

[0002] With the growth of international trade, increased supply chain uncertainty, and fluctuations in fuel costs, freight rates are exhibiting high-frequency volatility, making traditional fixed-contract models ineffective in hedging risks. Meanwhile, the widespread adoption of digital technologies has driven the development of the freight derivatives market, providing financial tools for freight rate risk management.

[0003] Current freight rate trading strategies rely on a single data source or on static historical data. Preprocessing is limited to basic cleaning and cannot integrate multi-dimensional data or accurately handle dynamic outliers and missing values, which leaves a weak data foundation. At the model level, most approaches use a single algorithm that either fails to capture the long-term time-series dependence of freight rates or cannot fit complex nonlinear feature relationships, leading to insufficient prediction accuracy and weak generalization. Strategy generation often targets a single revenue objective, ignores constraints such as capacity and capital, lacks differentiated design, and struggles to adapt to different cargo types and routes. Risk assessment remains basic and does not incorporate value at risk (VaR) models or multi-scenario stress testing, so risk control is poorly targeted. Furthermore, existing strategies generally lack dynamic optimization and closed-loop iteration: their parameters are fixed, they cannot adapt to market fluctuations through reinforcement learning, and they cannot monitor the market in real time to trigger secondary optimization. The result is a slow response to unexpected events and difficulty in balancing revenue against risk.

Summary of the Invention

[0004] To address the shortcomings of existing methods and systems, the present invention provides a machine learning-based method and system for optimizing freight rate trading strategies. The method combines multi-dimensional data preprocessing, prediction with a hybrid LSTM and XGBoost model, strategy generation by a genetic algorithm, and dynamic parameter tuning by reinforcement learning with a closed-loop iteration mechanism, so that it adapts to changes in the freight rate market and balances revenue improvement with risk management.

[0005] To achieve the above objectives, the technical solution adopted by the present invention is as follows. A machine learning-based method for optimizing freight rate trading strategies includes: collecting four types of data, namely freight supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data, filtering outliers with an adaptive threshold algorithm, filling in missing values by time-series interpolation, standardizing the heterogeneous data, and building a basic database with a timestamp index; extracting time-series, correlation, and trend features from the preprocessed data, screening key features with mutual information entropy and a random forest algorithm, removing redundant items, and establishing a dynamic update mechanism; constructing a freight rate prediction model based on a hybrid machine learning model that integrates a long short-term memory (LSTM) network and a gradient boosting tree model (XGBoost), in which the LSTM captures the long-term temporal dependence of freight rates and XGBoost fits the nonlinear feature associations; constructing a multi-objective function based on the prediction results, subject to constraints such as transportation capacity and capital, and solving it with a genetic algorithm to generate a differentiated initial strategy containing parameters such as buy and sell thresholds and position ratios; setting up a simulated trading environment, inputting historical and real-time data to test the initial strategy, using a value at risk model and stress testing to evaluate risk resistance in multiple scenarios, quantifying the stability of returns and the effectiveness of risk control, and generating a report; constructing a reinforcement learning agent based on the verification report, taking market conditions, strategy results, and risk indicators as inputs, and adaptively adjusting the strategy parameters with the goal of improving returns and reducing risks; and deploying the optimized strategy to the actual system, building a real-time monitoring module to capture market, strategy, and risk data synchronously, and triggering a secondary optimization process when the indicators deviate from expectations.

[0006] Preferably, the step of collecting the four types of data (freight supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data), filtering outliers with an adaptive threshold algorithm, filling in missing values by time-series interpolation, and standardizing heterogeneous data to construct a basic database with a timestamp index specifically includes: collecting multi-source data related to freight rates, covering four dimensions: freight market supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data. The data preprocessing process employs an adaptive threshold algorithm to filter out abnormal data, fills in missing values using a data completion model, standardizes heterogeneous data, constructs a freight rate database in a unified format, and establishes a data timestamp index.

[0007] Preferably, the step of extracting time-series, correlation, and trend features from the preprocessed data, screening key features with mutual information entropy and a random forest algorithm, eliminating redundant items, and establishing a dynamic update mechanism specifically includes: based on the preprocessed basic data, three categories of core features are extracted: time-series features, correlation features, and trend features. Time-series features include the daily fluctuation range of freight rates, the weekly change cycle, the monthly trend, and the quarterly seasonal fluctuation pattern. Correlation features include the correlation between the supply and demand gap and freight rates, the impact coefficient of macroeconomic indicators on freight rates, and the linkage between regional logistics support capacity and freight rates. Trend features include the duration of freight rate increases and decreases, the precursor features of inflection points, and the characteristics of sudden changes in freight rates under extreme market conditions. Mutual information entropy combined with the random forest algorithm is used to screen key features, eliminate redundant and weakly correlated features, and construct a feature set. A dynamic feature update mechanism is established to adjust feature dimensions and weights in real time according to market changes.

[0008] Preferably, the step of constructing the freight rate prediction model based on a hybrid machine learning model, which integrates a long short-term memory network and a gradient boosting tree model, uses the LSTM to capture long-term temporal dependencies in freight rates, and uses XGBoost to fit nonlinear feature associations, specifically includes: the hybrid prediction model is built by fusing a long short-term memory network and a gradient boosting tree. The LSTM model captures long-term dependencies in freight rate time-series data and explores how freight rates carry over between time periods. The XGBoost model captures the nonlinear correlation between freight rates and the core features and strengthens the influence of key features on the prediction results. An attention mechanism is introduced during model construction to assign higher weights to the time periods and features that most affect freight rate fluctuations, and cross-validation is used to divide the data into a training set, a validation set, and a test set.

[0009] Preferably, the step of constructing a multi-objective function based on the prediction results, subject to constraints such as transportation capacity and capital, and solving it with a genetic algorithm to generate a differentiated initial strategy containing buy and sell thresholds and position ratio parameters specifically includes: based on the freight rate forecast of the hybrid model, a multi-objective optimization function is constructed according to the needs of freight trading scenarios. The optimization objectives include maximizing freight revenue, minimizing transaction risk, and optimizing capital utilization. The constraints include capacity supply limitations, the capital turnover cycle, market liquidity constraints, and policy compliance requirements. The objective function is solved by a genetic algorithm to generate an initial trading strategy, which includes the freight rate buy threshold, sell threshold, position ratio, trading frequency, and risk hedging scheme. Differentiated strategy parameters are set for different cargo types and routes to adapt to diverse freight trading scenarios.
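
As an illustration only, the genetic algorithm step might be sketched as follows. The source does not specify the encoding, the operators, or how the objectives are combined, so everything here is an assumption: the forecast values, the `MAX_POSITION` cap standing in for the capital and capacity constraints, the weighted-sum scalarization (revenue minus a risk penalty) in `fitness`, and the elitist selection with uniform crossover and Gaussian mutation in `evolve`.

```python
import random
import statistics

# Hypothetical forecast of freight rates for one route (not from the source).
FORECAST = [100, 96, 92, 95, 101, 108, 112, 107, 99, 94, 98, 105, 111, 109]
MAX_POSITION = 0.8  # assumed capital / capacity cap on the position ratio


def simulate_profit(buy, sell, forecast):
    """Profit per unit position: enter at or below `buy`, exit at or above `sell`."""
    profit, entry = 0.0, None
    for price in forecast:
        if entry is None and price <= buy:
            entry = price
        elif entry is not None and price >= sell:
            profit += price - entry
            entry = None
    return profit


def fitness(ind, forecast, risk_aversion=0.5):
    """Scalarized objective: revenue minus a risk penalty; infeasible points score -inf."""
    buy, sell, position = ind
    if buy >= sell or not (0.0 < position <= MAX_POSITION):
        return float("-inf")  # constraint violation
    revenue = position * simulate_profit(buy, sell, forecast)
    risk = position * statistics.pstdev(forecast)
    return revenue - risk_aversion * risk


def clamp(x, lo, hi):
    return min(max(x, lo), hi)


def mutate(ind, rng, lo, hi):
    """Gaussian mutation of every gene, clamped to its allowed range."""
    buy, sell, position = ind
    return (clamp(buy + rng.gauss(0, 1.5), lo, hi),
            clamp(sell + rng.gauss(0, 1.5), lo, hi),
            clamp(position + rng.gauss(0, 0.05), 0.01, 1.0))


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent with equal probability."""
    return tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))


def evolve(forecast, generations=40, pop_size=30, seed=0):
    rng = random.Random(seed)
    lo, hi = min(forecast), max(forecast)
    pop = [(rng.uniform(lo, hi), rng.uniform(lo, hi), rng.uniform(0.01, 1.0))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, forecast), reverse=True)
        elite = pop[: pop_size // 5]  # elitism: the top 20 % survive unchanged
        children = [mutate(crossover(*rng.sample(elite, 2), rng), rng, lo, hi)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=lambda ind: fitness(ind, forecast))
```

Calling `evolve` once per route or cargo type, each with its own forecast and constraint values, would be one way to obtain the differentiated parameters described above. A Pareto-based solver could replace the weighted sum if the objectives should not be collapsed into one score.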

[0010] Preferably, the step of setting up a simulated trading environment, inputting historical and real-time data to test the initial strategy, using a value at risk model and stress testing to evaluate risk resilience in multiple scenarios, quantifying return stability and risk control effectiveness, and generating a report specifically includes: a simulated freight trading environment is set up, historical data and real-time incremental data are input into the simulation system, simulated trading is executed according to the initial trading strategy, and the profit data, risk data, and strategy execution efficiency are recorded during trading. The risk assessment uses a VaR model combined with stress testing to evaluate the strategy's risk resilience under different scenarios such as normal market fluctuations, extreme market shocks, and policy adjustments. The Sharpe ratio, maximum drawdown rate, and win rate are calculated to quantify the strategy's return stability and risk control capability, and a simulation verification report is generated.
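
The indicators named above can be computed from a series of per-period strategy returns. The sketch below is a minimal illustration under stated assumptions: it uses historical-simulation VaR (the source does not say which VaR variant is intended), a per-period Sharpe ratio with a zero risk-free rate, and a simple stress scenario that scales and shifts the returns. The function names and the sample returns are hypothetical.

```python
import statistics

# Hypothetical per-period returns recorded in the simulated environment.
RETURNS = [0.02, -0.01, 0.03, -0.04, 0.01, 0.02, -0.02, 0.015, 0.005, -0.005]


def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR, reported as a positive loss."""
    ordered = sorted(returns)
    n = len(ordered)
    # round() guards against float error such as (1 - 0.9) * 10 == 0.9999999999999998
    idx = min(int(round((1.0 - confidence) * n, 9)), n - 1)
    return -ordered[idx]


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation (not annualized)."""
    excess = [r - risk_free for r in returns]
    sd = statistics.stdev(excess)
    return statistics.fmean(excess) / sd if sd > 0 else 0.0


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def win_rate(returns):
    """Fraction of periods with a strictly positive return."""
    return sum(1 for r in returns if r > 0) / len(returns)


def stress(returns, vol_multiplier=1.0, shift=0.0):
    """Assumed stress scenario: amplify fluctuations and apply a uniform shock."""
    return [r * vol_multiplier + shift for r in returns]


def risk_report(returns, confidence=0.95):
    return {
        "var": historical_var(returns, confidence),
        "sharpe": sharpe_ratio(returns),
        "max_drawdown": max_drawdown(returns),
        "win_rate": win_rate(returns),
    }
```

Running `risk_report` on the base returns and again on `stress(RETURNS, vol_multiplier=2.0)` gives a side-by-side view of normal and shocked conditions, which is the kind of comparison a simulation verification report would contain.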

[0011] Preferably, the step of constructing the reinforcement learning agent based on the verification report, taking market state, strategy results, and risk indicators as inputs, and adaptively adjusting strategy parameters with the goal of improving returns and reducing risks specifically includes: a reinforcement learning agent is constructed on the basis of the simulation verification report to optimize the strategy dynamically. The state of the freight market, the results of strategy execution, and the risk indicators form the input state of the agent. The reward function rewards higher profits and lower risks, and the trading strategy parameters are continuously adjusted through trial-and-error learning. During optimization, a market adaptive mechanism is introduced: the agent captures changes in the market environment in real time, dynamically adjusts the freight rate buy and sell thresholds, the frequency of position adjustment, and the risk hedging ratio, and corrects parameters in the initial strategy that do not match actual market demand while retaining the core logic of the strategy, thereby achieving dynamic iterative optimization of the strategy.
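
The source does not name a reinforcement learning algorithm, so the following is only a toy sketch using tabular Q-learning. The two market states, the three hedge-ratio actions, the `OUTCOME` table of (profit, risk) pairs, and the risk penalty in `reward` are all hypothetical, and the next market state is drawn at random regardless of the action, which a real trading environment would not do.

```python
import random

STATES = ("calm", "volatile")
ACTIONS = ("lower_hedge", "keep", "raise_hedge")

# Hypothetical (profit, risk) outcome of each action in each market state.
OUTCOME = {
    ("calm", "lower_hedge"): (1.0, 0.2),
    ("calm", "keep"): (0.8, 0.2),
    ("calm", "raise_hedge"): (0.5, 0.1),
    ("volatile", "lower_hedge"): (1.2, 2.4),
    ("volatile", "keep"): (0.9, 1.2),
    ("volatile", "raise_hedge"): (0.8, 0.2),
}


def reward(state, action, risk_penalty=0.5):
    """Reward rises with profit and falls with risk."""
    profit, risk = OUTCOME[(state, action)]
    return profit - risk_penalty * risk


def train(steps=5000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)              # explore
        else:
            action = max(q[state], key=q[state].get)  # exploit
        r = reward(state, action)
        next_state = rng.choice(STATES)               # toy market transition
        target = r + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Best known action for each market state."""
    return {s: max(q[s], key=q[s].get) for s in STATES}
```

In this toy setting the learned policy lowers the hedge ratio in calm markets and raises it in volatile ones. The same update rule would apply to a richer state that also encodes strategy results and risk indicators, and to actions that adjust thresholds or position frequency.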

[0012] Preferably, the step of deploying the optimized strategy to the actual system, building a real-time monitoring module to capture market, strategy, and risk data synchronously, and triggering a secondary optimization process when the indicators deviate from expectations specifically includes: the optimized freight rate trading strategy is deployed to the actual freight trading system. A real-time data acquisition and monitoring module is established to capture real-time market data, strategy execution data, and risk indicator data, and to compare them dynamically with the optimized strategy parameters. When the market fluctuates abnormally, the weights of core features change, or the strategy's execution effect deviates from the expected target, a secondary optimization process is triggered to perform a closed-loop iteration of the strategy.
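
The three trigger conditions listed above could be checked along the following lines. This is a sketch only: the `Snapshot` fields and every numeric limit in `needs_reoptimization` are hypothetical, since the source gives no thresholds for what counts as a deviation.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One monitoring sample (all fields are assumed names)."""
    volatility: float            # observed market volatility
    feature_weight_shift: float  # change in core feature weights since deployment
    realized_return: float       # return actually achieved by the strategy
    expected_return: float       # return targeted by the optimized strategy


def needs_reoptimization(snap, vol_limit=0.05, weight_shift_limit=0.2,
                         return_tolerance=0.1):
    """Return the list of triggered conditions; an empty list means no action."""
    reasons = []
    if snap.volatility > vol_limit:
        reasons.append("abnormal market fluctuation")
    if snap.feature_weight_shift > weight_shift_limit:
        reasons.append("core feature weights changed")
    shortfall = snap.expected_return - snap.realized_return
    if shortfall > return_tolerance * abs(snap.expected_return):
        reasons.append("execution deviates from target")
    return reasons
```

A non-empty result would hand control back to the reinforcement learning step, closing the loop between monitoring and optimization.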

[0013] Furthermore, a machine learning-based freight rate trading strategy optimization system is proposed, including: a data acquisition module, which acquires multi-source data, filters outliers with an adaptive threshold algorithm, fills in missing data by time-series interpolation, standardizes heterogeneous data, and builds a basic database with a timestamp index; a feature engineering module, which extracts time-series, correlation, and trend features from the preprocessed data, screens key features with mutual information entropy and a random forest algorithm, and dynamically updates the feature set to adapt to market changes; a hybrid prediction model module, which integrates the LSTM and XGBoost models, in which the LSTM captures the long-term time-series dependence of freight rates and XGBoost fits nonlinear feature correlations, introduces an attention mechanism to strengthen the weights of key features, and improves generalization through cross-validation; a strategy generation module, which constructs a multi-objective function based on the prediction results and uses a genetic algorithm to generate differentiated initial strategies subject to constraints such as transportation capacity and capital; a simulated trading and risk assessment module, which builds a simulated environment to test strategies, combines a VaR model with stress testing to evaluate risk resistance, and quantifies return stability and risk control effectiveness; a reinforcement learning optimization module, which constructs a reinforcement learning agent based on the verification report and dynamically adjusts the strategy parameters, achieving adaptive optimization through feedback from market status and risk indicators; a closed-loop iteration module, which deploys the optimized strategy to the actual system, monitors market data and strategy performance in real time, and triggers a secondary optimization process when indicators deviate from expectations; and a processor, which performs the calculations for each formula and for the construction of each model.

[0014] Compared with the prior art, the advantages of the present invention are as follows. Collecting data from multiple dimensions and applying adaptive preprocessing, together with timestamp indexing, yields a high-quality database that lays a solid data foundation for the strategy. Mutual information entropy and random forests jointly select features under a dynamic update mechanism. Combined with a hybrid model of LSTM and XGBoost, the system captures both long-term temporal dependencies and nonlinear correlations in freight rates, achieving higher prediction accuracy than a single model. Differentiated strategies are generated by a multi-objective function solved with a genetic algorithm. Risk is assessed through VaR models and stress testing, and the strategy is then tuned dynamically by a reinforcement learning agent to optimize both returns and risk control. With real-time monitoring and a closed-loop iteration mechanism, the strategy adapts quickly to market fluctuations and scenario changes, balancing versatility and specificity. This addresses the high market uncertainty and the difficulty of risk control in freight rate trading, and significantly improves the rigor and stability of trading decisions.

Attached Figure Description

[0015] Figure 1 is a schematic diagram of the method proposed in this invention; Figure 2 is a schematic diagram of the data acquisition and preprocessing proposed in this invention; Figure 3 is a schematic diagram of the construction of the freight rate feature engineering proposed in this invention; Figure 4 is a schematic diagram of the construction of the freight rate prediction model proposed in this invention; Figure 5 is a schematic diagram of the generation of the initial freight rate trading strategy proposed in this invention; Figure 6 is a schematic diagram of the simulation verification and risk assessment of the trading strategy proposed in this invention; Figure 7 is a schematic diagram of the dynamic optimization of the strategy proposed in this invention; Figure 8 is a schematic diagram of the strategy deployment and real-time monitoring iteration proposed in this invention.

Detailed Implementation

[0016] The following description is intended to disclose the invention and enable those skilled in the art to implement it. The preferred embodiments described below are merely examples, and other obvious variations will occur to those skilled in the art.

[0017] A machine learning-based freight rate trading strategy optimization system includes: a data acquisition module, which acquires multi-source data, filters outliers with an adaptive threshold algorithm, fills in missing data by time-series interpolation, standardizes heterogeneous data, and builds a basic database with a timestamp index; a feature engineering module, which extracts time-series, correlation, and trend features from the preprocessed data, screens key features with mutual information entropy and a random forest algorithm, and dynamically updates the feature set to adapt to market changes; a hybrid prediction model module, which integrates the LSTM and XGBoost models, in which the LSTM captures the long-term time-series dependence of freight rates and XGBoost fits nonlinear feature correlations, introduces an attention mechanism to strengthen the weights of key features, and improves generalization through cross-validation; a strategy generation module, which constructs a multi-objective function based on the prediction results and uses a genetic algorithm to generate differentiated initial strategies subject to constraints such as transportation capacity and capital; a simulated trading and risk assessment module, which builds a simulated environment to test strategies, combines a VaR model with stress testing to evaluate risk resistance, and quantifies return stability and risk control effectiveness; a reinforcement learning optimization module, which constructs a reinforcement learning agent based on the verification report and dynamically adjusts the strategy parameters, achieving adaptive optimization through feedback from market status and risk indicators; a closed-loop iteration module, which deploys the optimized strategy to the actual system, monitors market data and strategy performance in real time, and triggers a secondary optimization process when indicators deviate from expectations; and a processor, which performs the calculations for each formula and for the construction of each model.

[0018] As shown in Figure 1, a machine learning-based method for optimizing freight rate trading strategies includes: Step 1: Collect four types of data: freight supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data. Use an adaptive threshold algorithm to filter out outliers and time-series interpolation to fill in missing values. After standardizing the heterogeneous data, build a basic database with a timestamp index; Step 2: Extract time-series, correlation, and trend features from the preprocessed data, use mutual information entropy and a random forest algorithm to screen key features, remove redundant items, and establish a dynamic update mechanism; Step 3: Construct a freight rate prediction model based on a hybrid machine learning model, which integrates a long short-term memory network and a gradient boosting tree model, uses the LSTM to capture long-term temporal dependencies of freight rates, and uses XGBoost to fit nonlinear feature associations; Step 4: Construct a multi-objective function based on the prediction results, subject to constraints such as transportation capacity and capital, and solve it with a genetic algorithm to generate a differentiated initial strategy containing buy and sell thresholds and position ratio parameters; Step 5: Set up a simulated trading environment, input historical and real-time data to test the initial strategy, use the value at risk model and stress testing to evaluate risk resistance in multiple scenarios, quantify the stability of returns and the effectiveness of risk control, and generate a report; Step 6: Based on the verification report, construct a reinforcement learning agent that takes market status, strategy results, and risk indicators as inputs and adaptively adjusts strategy parameters with the goal of improving returns and reducing risks; Step 7: Deploy the optimized strategy to the actual system, build a real-time monitoring module to capture market, strategy, and risk data synchronously, and trigger a secondary optimization process when the indicators deviate from expectations.

[0019] As shown in Figure 2, four types of data are collected: freight supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data. An adaptive threshold algorithm is used to filter out outliers, and time-series interpolation is used to fill in missing values. After the heterogeneous data are standardized, a basic database with a timestamp index is constructed. Specifically, this includes: collecting multi-source data related to freight rates, covering four dimensions: freight market supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data. The data preprocessing process employs an adaptive threshold algorithm to filter out abnormal data, fills in missing values using a data completion model, standardizes heterogeneous data, constructs a freight rate database in a unified format, and establishes a data timestamp index.

[0020] Specifically, the data preprocessing stage is executed in three steps. First, abnormal data filtering: the adaptive threshold algorithm computes normal distribution statistics for the data of each dimension, dynamically sets floating thresholds of ±3σ, and identifies abnormal data such as sudden extreme values and collection errors, while separately marking and retaining reasonable freight rate fluctuations under extreme market conditions so that they are not deleted by mistake. Second, missing value completion: for continuous data, time-series interpolation is combined with a correction based on the average of the same cargo type and route; for discrete data, voting over neighboring data is used. Third, heterogeneous data standardization: Z-score standardization converts economic indicators, capacity data, and efficiency data of different dimensions to the same numerical range, and text data is encoded, to build a structured basic database in a unified format. Finally, a multi-dimensional timestamp index is established that associates data collection time, route dimension, cargo type, and other tags, so that the data can be correlated in time and traced.
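
A minimal sketch of the three preprocessing steps for a single continuous series is given below. It is an illustration under simplifying assumptions: the ±3σ threshold is computed once over the whole series instead of adapting over time, missing values are filled by plain linear interpolation without the cargo-type and route correction, and the function names and sample rates are hypothetical.

```python
import statistics


def filter_outliers(values, k=3.0):
    """Flag points outside mean ± k·σ. Returns (cleaned, flags); flagged points become None."""
    present = [v for v in values if v is not None]
    mu = statistics.fmean(present)
    sigma = statistics.pstdev(present)
    cleaned, flags = [], []
    for v in values:
        is_outlier = v is not None and sigma > 0 and abs(v - mu) > k * sigma
        flags.append(is_outlier)
        cleaned.append(None if is_outlier else v)
    return cleaned, flags


def interpolate_missing(values):
    """Linear interpolation between known neighbours; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out


def zscore(values):
    """Z-score standardization to zero mean and unit variance."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical daily freight rates with one gap and one collection error.
rates = [100.0, 102.0, None, 106.0] + [100.0] * 16 + [1000.0]
cleaned, flags = filter_outliers(rates)
filled = interpolate_missing(cleaned)
standardized = zscore(filled)
```

Because `filter_outliers` returns the flags separately, a caller can review flagged points and restore those that reflect genuine extreme market conditions before interpolation, in line with the retention rule described above.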

[0021] As shown in Figure 3, time-series, correlation, and trend features are extracted from the preprocessed data. Mutual information entropy and a random forest algorithm are used to screen key features, redundant items are removed, and a dynamic update mechanism is established. Specifically, this includes: based on the preprocessed basic data, three categories of core features are extracted: time-series features, correlation features, and trend features. Time-series features include the daily fluctuation range of freight rates, the weekly change cycle, the monthly trend, and the quarterly seasonal fluctuation pattern. Correlation features include the correlation between the supply and demand gap and freight rates, the impact coefficient of macroeconomic indicators on freight rates, and the linkage between regional logistics support capacity and freight rates. Trend features include the duration of freight rate increases and decreases, the precursor features of inflection points, and the characteristics of sudden changes in freight rates under extreme market conditions. Mutual information entropy combined with the random forest algorithm is used to screen key features, eliminate redundant and weakly correlated features, and construct a feature set. A dynamic feature update mechanism is established to adjust feature dimensions and weights in real time according to market changes.

[0022] Specifically, time-series feature extraction adopts a multi-granularity time slicing method. At the daily level, it calculates the daily increase or decrease in freight rates, the ratio of the difference between the day's highest and lowest prices, and the volatility coefficient of the average intraday transaction price. At the weekly level, it analyzes the slope of the freight rate trend over seven consecutive days and the frequency of, and interval between, peaks and troughs within the week. At the monthly level, it extracts the year-on-year and month-on-month growth rates of the average monthly freight rate and the proportion of stable freight rate periods within the month. At the quarterly level, it combines cargo type attributes to mine seasonal fluctuation patterns and to distinguish freight rate fluctuations in peak and off-peak seasons. Correlation feature extraction adopts a grouped linkage analysis approach: data are grouped by route and cargo type to calculate the real-time linkage coefficient between the supply and demand gap and freight rates, quantifying the freight rate fluctuation that corresponds to a 1% change in the gap; for macroeconomic indicators, directly and indirectly correlated items are distinguished, and the lagged impact period of each indicator on freight rates is extracted; the correlation features of regional logistics support capacity are further refined into the negative correlation between port loading and unloading efficiency and freight rates, the positive correlation between warehousing turnover days and forward freight rates, and the compounding effect of land transportation congestion duration on short-haul route freight rates, so that the correlation logic is concrete and quantifiable. The correlation between each extracted feature and the freight rate label is calculated by mutual information entropy. A correlation threshold of 0.3 is set, and weakly correlated features below the threshold are filtered out. A feature importance evaluation model is then constructed with the random forest algorithm to calculate the contribution weight of each feature to the freight rate prediction result, and the top 30% of core features with the highest weights are retained. At the same time, redundant features are handled by checking the correlation coefficient between features. The formula for calculating mutual information entropy is:

$$I(X;Y) = \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy$$

[0023] where $I(X;Y)$ represents the mutual information entropy value between feature $X$ and freight rate label $Y$, a larger value indicating a stronger correlation; $p(x,y)$ is the joint probability density function of feature $X$ and freight rate label $Y$; $p(x)$ is the marginal probability density function of feature $X$; and $p(y)$ is the marginal probability density function of the freight rate label $Y$.
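
In practice the densities are unknown and the mutual information has to be estimated from samples. The sketch below uses a simple plug-in estimate over equal-width histogram bins, which is an assumption: the source does not describe the estimator, nor whether the 0.3 threshold applies to a raw value in nats (as here) or to a normalized score. The feature names and data are hypothetical.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Plug-in estimate of I(X;Y) in nats from the joint histogram of x and y."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log(p_ij / ((px[i] / n) * (py[j] / n)))
    return mi


def screen_features(features, label, threshold=0.3, bins=4):
    """Keep the names of features whose estimated MI with the label reaches the threshold."""
    return [name for name, column in features.items()
            if mutual_information(column, label, bins) >= threshold]


# Hypothetical data: one informative feature, one constant, one unrelated to the label.
label = [float(i) for i in range(16)]
features = {
    "supply_demand_gap": list(label),
    "constant_fee": [5.0] * 16,
    "alternating_flag": [float(i % 2) for i in range(16)],
}
kept = screen_features(features, label)
```

Here only `supply_demand_gap` survives the screen. The random forest importance ranking described above would then be applied to the surviving features as a second filter.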

[0024] As shown in Figure 4, a freight rate prediction model based on a hybrid machine learning model is constructed, which integrates a long short-term memory network and a gradient boosting tree model. The LSTM captures the long-term temporal dependence of freight rates, and XGBoost fits the nonlinear feature associations. Specifically, this includes: the hybrid prediction model is built by fusing a long short-term memory network and a gradient boosting tree. The LSTM model captures long-term dependencies in freight rate time-series data and explores how freight rates carry over between time periods. The XGBoost model captures the nonlinear correlation between freight rates and the core features and strengthens the influence of key features on the prediction results. An attention mechanism is introduced during model construction to assign higher weights to the time periods and features that most affect freight rate fluctuations, and cross-validation is used to divide the data into a training set, a validation set, and a test set.

[0025] Specifically, the core modules and hierarchical connections of the model are divided into a temporal feature processing module, a nonlinear feature processing module, an attention enhancement module, and a fusion output module. The temporal feature processing module is an LSTM sub-model, containing an input layer, three hidden layers, and an output layer. The input layer receives the generated temporal feature set, each hidden layer has 128 neurons and uses the tanh activation function. Features are transferred between layers through a fully connected approach. The output layer outputs intermediate results of the temporal fare prediction. The nonlinear feature processing module is an XGBoost sub-model, containing a feature mapping layer, 15 gradient boosting decision trees, and a result output layer. The feature mapping layer performs dimensionality transformation on related features and trend features. The decision tree depth is set to 6, and each tree outputs a local nonlinear prediction result. The attention enhancement module is embedded between the hidden and output layers of the LSTM sub-model. It assigns differentiated weights to temporal features of different time periods through a weight allocation matrix, increasing the weight of core time period data to 1.5-2 times that of ordinary time periods. The fusion output module connects the outputs of the two sub-models using a weighted summation method. It integrates the results of the two models through dynamic weight factors to form the final fare prediction value. The first step in model training is data partitioning. The feature set is divided into training, validation, and test sets in a 7:2:1 ratio, using 5-fold cross-validation to avoid overfitting. The second step is LSTM sub-model training, with a batch size of 32, 100 iterations, and an initial learning rate of 0.001. The Adam optimizer is used to adjust parameters, with mean squared error as the loss function. 
An early stopping mechanism is triggered when the validation set loss shows no decrease for five consecutive rounds, to prevent overtraining. The third step is XGBoost sub-model training, with a learning rate of 0.01, a subsample ratio of 0.8, and a minimum sum of sample weights for each tree of 0.1. Grid search is used to optimize the number and depth of decision trees, with mean absolute error as the evaluation metric. The fourth step is weight fusion calibration, where the weights of the two sub-models are determined based on their prediction accuracy on the validation set. The LSTM sub-model weight ranges from 0.4 to 0.6, and the XGBoost sub-model weight correspondingly ranges from 0.6 to 0.4. The more accurate sub-model receives the higher weight, and the weights are dynamically adjusted in real time. To address different shipping routes such as ocean, near-sea, and inland waterway routes, the number of neurons in the hidden layers of the LSTM sub-model is adjusted. For ocean routes, which have stronger temporal dependencies, the number is set to 128, while for inland waterway routes it is set to 64. To accommodate differences in cargo types such as general cargo, hazardous chemicals, and cold chain, the feature weights of the XGBoost sub-model are optimized. For hazardous chemical routes, the weight of the "frequency of transportation compliance inspections" feature is strengthened, while for cold chain routes, the proportion of the "temperature control resource saturation" feature is increased. To adapt to extreme market scenarios, an abnormal scenario label is added to the attention module, assigning high weights to data in these scenarios to enhance the model's predictive ability for unexpected situations. The input is configured differently for each sub-model. The input of the LSTM sub-model is the extracted time-series feature subset, and the input of the XGBoost sub-model is the correlation feature and trend feature subset. 
Both types of input come from the same basic database. The output is divided into intermediate output and final output. The LSTM outputs the time-series dimension freight rate prediction value, and the XGBoost outputs the feature correlation dimension prediction value. After fusion, the freight rate prediction results for the next 7 days, 30 days and 90 days are output, which directly serve as the core data basis for generating the initial trading strategy in step four.
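By way of illustration only, the weight fusion calibration and the 7:2:1 data partitioning described above might be sketched as follows. The inverse-error rule for deriving the weights, the chronological ordering of the split, and all function names (`fusion_weights`, `fuse`, `chronological_split`) are assumptions made for this sketch; the description only states that the more accurate sub-model receives the higher weight within the 0.4 to 0.6 range.

```python
def fusion_weights(lstm_val_mae, xgb_val_mae, low=0.4, high=0.6):
    """Return (w_lstm, w_xgb); the sub-model with lower validation error gets more weight."""
    # Assumption: inverse-error weighting. The source only requires that higher
    # accuracy yields a higher weight, bounded to the 0.4-0.6 range.
    inv_lstm = 1.0 / lstm_val_mae
    inv_xgb = 1.0 / xgb_val_mae
    w_lstm = inv_lstm / (inv_lstm + inv_xgb)
    w_lstm = min(high, max(low, w_lstm))  # clamp to the stated range
    return w_lstm, 1.0 - w_lstm


def fuse(lstm_pred, xgb_pred, w_lstm, w_xgb):
    """Weighted summation of the two sub-model outputs, element by element."""
    return [w_lstm * a + w_xgb * b for a, b in zip(lstm_pred, xgb_pred)]


def chronological_split(rows, ratios=(7, 2, 1)):
    """Split time-ordered rows into train/validation/test without shuffling."""
    # Assumption: the split preserves time order, which is usual for time series.
    total = sum(ratios)
    n = len(rows)
    train_end = n * ratios[0] // total
    val_end = n * (ratios[0] + ratios[1]) // total
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


if __name__ == "__main__":
    w_lstm, w_xgb = fusion_weights(lstm_val_mae=12.0, xgb_val_mae=15.0)
    print(w_lstm, w_xgb)
    print(fuse([1000.0, 1020.0], [990.0, 1040.0], w_lstm, w_xgb))
```

Clamping keeps either sub-model from dominating the fused forecast even when one of them is far more accurate on the validation set, which matches the bounded weight ranges given in the description.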

[0026] As shown in Figure 5, a multi-objective function is constructed based on the prediction results, subject to constraints such as transportation capacity and capital. This function is solved using a genetic algorithm to generate differentiated initial strategies containing parameters such as buy / sell thresholds and position ratios. Specifically, this includes: based on the freight rate forecast results of the hybrid model, a multi-objective optimization function is constructed in combination with the needs of freight transaction scenarios. The optimization objectives include maximizing freight revenue, minimizing transaction risk, and optimizing capital utilization. The constraints include capacity supply limitations, capital turnover cycle, market liquidity constraints, and policy compliance requirements. The objective function is solved by a genetic algorithm to generate an initial trading strategy, which includes the freight rate buy threshold, sell threshold, position ratio, trading frequency and risk hedging scheme. Differentiated strategy parameters are set for different types of cargo and different routes to adapt to diverse freight transaction scenarios.

[0027] Specifically, the multi-objective optimization objective function is concretely constructed, clarifying the quantitative orientation and priority of the three core optimization objectives. The profit maximization objective focuses on balancing single-batch transaction profits and annualized profits, using the predicted freight rate increase as the core anchor point, prioritizing transaction opportunities with increases exceeding 5%, while controlling the proportion of transactions on single routes and single cargo types. The risk minimization objective addresses both freight rate volatility risk and contract default risk, controlling the daily fluctuation range of freight rates within 3% and setting the contract default rate threshold below 1%. The capital utilization optimization objective requires that the proportion of idle funds not exceed 15%, ensuring that funds dynamically circulate in multiple batches and multiple scenarios of transactions. The three objectives are weighted according to the principle of "risk priority, balance between profit and efficiency," with risk control accounting for 40%, and profit and capital efficiency each accounting for 30%. The capacity supply constraints are set differently according to the route and cargo type. For ocean routes, the capacity of a single batch of transactions shall not exceed 20% of the real-time available capacity of the route, and for inland waterway routes, it shall not exceed 30%. Special cargo types such as hazardous chemicals and cold chain require matching capacity with corresponding transportation qualifications. The capital turnover constraints are combined with the settlement cycle of the cargo type. The capital turnover cycle for general cargo transactions shall not exceed 15 working days, and for cold chain and hazardous chemicals, it shall not exceed 30 working days. 
The market liquidity constraints require that the average transaction frequency of the trading target in the past 30 days shall not be less than 5 times / day. A genetic algorithm is used to solve the objective function, clarifying the implementation steps and key parameters. The first step is to initialize the population, generating 50 sets of randomized strategy parameter combinations as the initial population. The second step is to design a fitness function, incorporating the achievement rate of the three optimization objectives and the satisfaction of constraints into the evaluation; parameter combinations that do not meet the constraints are directly eliminated. The third step is to execute selection, crossover, and mutation operations. The selection operation retains the top 30% of parameter combinations in terms of fitness, the crossover operation uses a single-point crossover method, and the mutation operation randomly fine-tunes core parameters such as buy / sell thresholds and position ratios. The fourth step is to terminate the iteration; when the fitness of the optimal parameter combination does not improve for 10 consecutive generations, the iteration stops, and the optimal parameter set is output. Finally, a structured initial strategy is generated, and scenario adaptation rules are refined.
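As a non-limiting illustration, the four genetic algorithm steps described above (a population of 50, elimination of infeasible combinations, retention of the top 30%, single-point crossover, mutation, and termination after 10 generations without improvement) might be sketched as follows. The parameter bounds, the toy scoring terms inside `fitness`, the stand-in constraint in `feasible`, the mutation settings, and the `max_generations` cap are all hypothetical; only the 40% / 30% / 30% objective weighting and the population settings come from the description.

```python
import random

POP_SIZE = 50      # initial population size, from the description
KEEP_RATIO = 0.3   # selection keeps the top 30% by fitness
PATIENCE = 10      # stop after 10 generations without improvement

# Hypothetical bounds for (buy threshold, sell threshold, position ratio).
BOUNDS = [(0.0, 0.2), (0.0, 0.2), (0.0, 1.0)]


def random_individual(rng):
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def feasible(ind):
    # Hypothetical stand-in for the capacity, capital and liquidity constraints.
    buy, sell, position = ind
    return sell >= buy and position <= 0.9


def fitness(ind):
    # Toy objective terms in [0, 1]; weights follow "risk 40%, profit 30%, capital 30%".
    buy, sell, position = ind
    risk_score = 1.0 - position
    profit_score = min(1.0, position * sell / 0.1)
    capital_score = position
    return 0.4 * risk_score + 0.3 * profit_score + 0.3 * capital_score


def crossover(a, b, rng):
    point = rng.randint(1, len(a) - 1)  # single-point crossover
    return a[:point] + b[point:]


def mutate(ind, rng, rate=0.2, scale=0.05):
    out = list(ind)
    for i, (lo, hi) in enumerate(BOUNDS):
        if rng.random() < rate:  # randomly fine-tune this parameter
            out[i] = min(hi, max(lo, out[i] + rng.gauss(0.0, scale * (hi - lo))))
    return out


def evolve(seed=0, max_generations=500):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(POP_SIZE)]
    best, best_fit, stale = None, float("-inf"), 0
    for _ in range(max_generations):
        # Combinations that violate the constraints are eliminated outright.
        ranked = sorted((i for i in population if feasible(i)), key=fitness, reverse=True)
        if ranked and fitness(ranked[0]) > best_fit:
            best, best_fit, stale = ranked[0], fitness(ranked[0]), 0
        else:
            stale += 1
        if stale >= PATIENCE:
            break
        parents = ranked[:int(KEEP_RATIO * POP_SIZE)]
        while len(parents) < 2:  # pad if too few feasible survivors remain
            parents.append(random_individual(rng))
        children = []
        while len(parents) + len(children) < POP_SIZE:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return best, best_fit


if __name__ == "__main__":
    print(evolve(seed=42))
```

Because the retained parents are carried into the next generation unchanged, the best fitness never decreases, so the "no improvement for 10 consecutive generations" test is well defined.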

[0028] As shown in Figure 6, a simulated trading environment is set up, historical and real-time data are input to test the initial strategy, and a value at risk (VaR) model and stress testing are used to evaluate the risk resistance capability in multiple scenarios. The stability of returns and the effectiveness of risk control are quantified and a report is generated. Specifically, this includes: a simulated freight trading environment is set up, historical data and real-time incremental data are input into the simulation system, simulated trading is executed according to the initial trading strategy, and the profit data, risk data and strategy execution efficiency are recorded during the trading process. The risk assessment uses a VaR model combined with stress testing to evaluate the strategy's risk resilience under different scenarios such as normal market fluctuations, extreme market shocks, and policy adjustments. It calculates the Sharpe ratio, maximum drawdown rate, and win rate to quantify the strategy's return stability and risk control capability, and generates a simulation verification report.

[0029] Specifically, the multi-dimensional risk assessment adopts a combination of "VaR model + stress test + core indicator quantification". The VaR model divides the assessment units by cargo type and route, calculates the maximum possible loss per day and per week at a 95% confidence level, and clarifies the risk boundaries under different scenarios. The stress test gradually increases the impact intensity for extreme scenarios, such as increasing the decline in port loading and unloading efficiency from 20% to 50%, and tests the break-even point and default risk threshold of the strategy under extreme conditions. The core indicator quantification calculates the Sharpe ratio, maximum drawdown rate, and win rate. A Sharpe ratio below 1.2 is considered as insufficient return stability, a maximum drawdown rate exceeding 15% triggers a risk warning, and a win rate below 50% marks a defect in the effectiveness of the strategy decision-making. At the same time, the efficiency of strategy execution is evaluated, and bottlenecks at the execution level are identified.
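For illustration, the core indicator quantification and the 95% VaR calculation described above might be sketched as follows. Historical simulation as the VaR method, a zero risk-free rate, annualization by 252 periods, and all function names are assumptions of this sketch; the thresholds (Sharpe ratio below 1.2, maximum drawdown above 15%, win rate below 50%) are those stated in the description.

```python
import math
import statistics


def historical_var(returns, confidence=0.95):
    """Loss level not exceeded with the given confidence (historical simulation)."""
    losses = sorted(-r for r in returns)
    index = min(len(losses) - 1, math.ceil(confidence * len(losses)) - 1)
    return losses[index]


def sharpe_ratio(returns, periods_per_year=252):
    # Assumption: zero risk-free rate and simple annualization.
    spread = statistics.stdev(returns)
    if spread == 0:
        return 0.0
    return statistics.mean(returns) / spread * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def win_rate(returns):
    return sum(1 for r in returns if r > 0) / len(returns)


def assess(returns, periods_per_year=252):
    """Apply the thresholds from the description and collect the triggered flags."""
    sharpe = sharpe_ratio(returns, periods_per_year)
    drawdown = max_drawdown(returns)
    wins = win_rate(returns)
    flags = []
    if sharpe < 1.2:
        flags.append("insufficient return stability")
    if drawdown > 0.15:
        flags.append("risk warning")
    if wins < 0.5:
        flags.append("strategy effectiveness defect")
    return {"sharpe": sharpe, "max_drawdown": drawdown, "win_rate": wins, "flags": flags}


if __name__ == "__main__":
    sample = [0.01, -0.02, 0.015, 0.003, -0.004, 0.02, -0.01, 0.006]
    print(historical_var(sample), assess(sample))
```

In practice the returns would be grouped by cargo type and route before calling these functions, since the description assigns VaR assessment units along those two dimensions.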

[0030] As shown in Figure 7, a reinforcement learning agent is constructed based on the verification report. Market state, strategy results, and risk indicators are used as inputs, and, with the goal of increasing returns and reducing risk, the agent adaptively adjusts the strategy parameters. Specifically, this includes: for the dynamic optimization of the strategy by reinforcement learning, a reinforcement learning agent is constructed on the basis of the simulation verification report. The state of the freight market, the results of strategy execution, and the risk indicators are used as the input state of the agent. The reward function rewards increased profit and reduced risk. The trading strategy parameters are continuously adjusted through trial-and-error learning. During the optimization process, a market adaptive mechanism is introduced. The agent captures changes in the market environment in real time, dynamically adjusts the freight rate buy and sell thresholds, the frequency of position adjustment, and the risk hedging ratio, corrects parameters in the initial strategy that do not match actual market demand, retains the core logic of the strategy, and realizes dynamic iterative optimization of the strategy.

[0031] Specifically, a targeted reinforcement learning agent is constructed, employing a deep deterministic policy gradient algorithm to build a multi-module architecture comprising three core modules: state perception, strategy decision-making, and learning update. The state perception module is responsible for collecting real-time freight market state data, strategy execution feedback data, and risk assessment data, standardizing them, and converting them into state vectors that the agent can recognize, while associating them with the strategy defect labels marked in the simulation report. The strategy decision-making module adopts a continuous action space design, so that it can output fine-grained adjustments to the buy and sell thresholds, position ratios, and risk hedging ratios, rather than discrete decision results. The learning update module embeds an experience replay mechanism, storing 1000 past "state-action-reward-next state" samples and randomly selecting samples for training to improve learning efficiency. The iterative optimization process is executed in four steps. Step 1: Initialization. This involves loading strategy defect data from the simulation verification report, setting the initial agent parameters, and defining the parameter adjustment boundaries. Step 2: Trial-and-error learning. The agent executes trades in the simulated trading environment according to the current strategy, performing 100 trades per batch, collecting state data and reward values in real time, and recording the effect of parameter adjustments. Step 3: Parameter update. This involves optimizing the strategy network parameters using gradient descent, prioritizing the correction of high-priority defects marked in the simulation report while retaining valid parameters. Step 4: Termination judgment. 
When the reward value fluctuation range is less than 3% for five consecutive batches, the core indicators meet the standards, and no new defects are added, the iteration stops, and the optimized strategy parameter set is output.
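As an illustrative sketch only, the experience replay mechanism (1000 stored samples, random sampling) and the termination judgment described above might look as follows. This is not a deep deterministic policy gradient implementation; it shows only the buffer and the stopping test. Reading "fluctuation range less than 3%" as the relative change between consecutive batch rewards is an assumption, as are the class and function names.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) samples."""

    def __init__(self, capacity=1000, seed=0):
        self._items = deque(maxlen=capacity)  # oldest samples are dropped first
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state):
        self._items.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Random sampling breaks the time correlation between stored transitions.
        pool = list(self._items)
        return self._rng.sample(pool, min(batch_size, len(pool)))

    def __len__(self):
        return len(self._items)


def rewards_converged(batch_rewards, window=5, tolerance=0.03):
    """True if each of the last `window` batch-to-batch changes is under `tolerance`."""
    if len(batch_rewards) < window + 1:
        return False
    recent = batch_rewards[-(window + 1):]
    for prev, cur in zip(recent, recent[1:]):
        if prev == 0 or abs(cur - prev) / abs(prev) >= tolerance:
            return False
    return True


def should_stop(batch_rewards, indicators_ok, new_defects):
    # All three conditions from the description must hold at the same time.
    return rewards_converged(batch_rewards) and indicators_ok and new_defects == 0


if __name__ == "__main__":
    buffer = ReplayBuffer()
    for step in range(1200):
        buffer.add(state=[step], action=[0.0], reward=float(step), next_state=[step + 1])
    print(len(buffer), len(buffer.sample(32)))
    print(should_stop([100, 101, 102, 101, 100, 101], indicators_ok=True, new_defects=0))
```

A bounded `deque` gives the first-in, first-out eviction that a fixed 1000-sample store implies, without any explicit bookkeeping.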

[0032] As shown in Figure 8, the optimized strategy is deployed to the actual system, and a real-time monitoring module is built to synchronously capture market, strategy, and risk data. When the indicators deviate from expectations, a secondary optimization process is triggered. Specifically, this includes: the optimized freight rate trading strategy is deployed to the actual freight trading system. A real-time data acquisition and monitoring module is established to capture real-time market data, strategy execution data, and risk indicator data, and to compare them dynamically with the optimized strategy parameters. When the market experiences abnormal fluctuations, the weights of core features change, or the strategy's execution effect deviates from the expected target, a secondary optimization process is triggered to perform a closed-loop iteration of the strategy.

[0033] Specifically, the Level 1 warning targets minor deviation scenarios. When the fluctuation range of core indicators is within 5%, or the execution efficiency of a single route strategy declines slightly, the system automatically triggers parameter fine-tuning instructions, with the adjustment range controlled within ±1%, without manual intervention. The Level 2 warning targets severe deviation scenarios. When the fluctuation of indicators exceeds 5%, extreme market fluctuations occur, the weight change of core features exceeds 10%, or the strategy execution effect deviates from the expected target for three consecutive working days, the system immediately issues an alarm signal, freezes the trading permissions of high-risk routes, and triggers the secondary optimization process of the strategy. The closed-loop iterative optimization strictly connects with the previous steps to ensure dynamic adaptation of the strategy. After the secondary optimization process is initiated, the feature set is updated based on monitoring data, the hybrid prediction model is reconstructed and retrained to generate an initial strategy adapted to the new market environment. Through simulation verification and risk assessment, the updated strategy is obtained through reinforcement learning optimization. After optimization, the original strategy is replaced by canary deployment, while retaining historical strategy execution data and optimization records to form a closed-loop link of "deployment-monitoring-early warning-optimization". The monitoring focus is optimized according to different scenarios. For ocean shipping routes, long-term freight rate trend monitoring is strengthened, and for hazardous chemical routes, compliance indicators are focused to ensure that the strategy always keeps pace with the dynamic changes in the freight market and maintains the best trading results.
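By way of example, the two warning levels described above might be encoded as follows. The function names, the "level 0" return value for no deviation, and the reading of "within ±1%" as a relative bound on each parameter are assumptions of this sketch; the 5%, 10% and three-working-day thresholds are those given in the description.

```python
def classify_warning(indicator_fluctuation, feature_weight_change=0.0,
                     consecutive_deviation_days=0, extreme_market=False):
    """Return 2 for a severe deviation, 1 for a minor one, 0 for none."""
    severe = (
        indicator_fluctuation > 0.05          # core indicator moved by more than 5%
        or extreme_market                     # extreme market fluctuation observed
        or feature_weight_change > 0.10       # core feature weight shifted by more than 10%
        or consecutive_deviation_days >= 3    # off target for three working days in a row
    )
    if severe:
        return 2  # alarm, freeze high-risk routes, trigger secondary optimization
    if indicator_fluctuation > 0.0:
        return 1  # automatic fine-tuning, no manual intervention
    return 0      # assumption: no deviation means no warning


def fine_tune(current, proposed, max_rel_change=0.01):
    """Clamp a Level 1 adjustment to within +/-1% of the current parameter value."""
    lower = current * (1.0 - max_rel_change)
    upper = current * (1.0 + max_rel_change)
    if lower > upper:  # handles negative parameter values
        lower, upper = upper, lower
    return min(upper, max(lower, proposed))


if __name__ == "__main__":
    print(classify_warning(0.03))
    print(classify_warning(0.02, feature_weight_change=0.12))
    print(fine_tune(current=100.0, proposed=105.0))
```

Checking the severe conditions first means that a small indicator fluctuation never masks a large feature weight shift or a sustained deviation.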

[0034] It should be noted that the order of the above embodiments of the present invention is merely for descriptive purposes and does not represent the superiority or inferiority of the embodiments. Furthermore, the above description focuses on specific embodiments of this specification. Additionally, the processes depicted in the accompanying drawings do not necessarily require a specific or sequential order to achieve the desired results. In some embodiments, multitasking and parallel processing are possible or may be advantageous.

[0035] The various embodiments in this specification are described in a progressive manner. The same or similar parts between the various embodiments can be referred to each other. Each embodiment focuses on describing the differences from other embodiments.

[0036] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for optimizing freight rate trading strategies based on machine learning, characterized in that it comprises: collecting four types of data, namely freight supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data, filtering out outliers using an adaptive threshold algorithm, filling in missing values by time-series interpolation, and, after standardizing the heterogeneous data, building a basic database with a timestamp index; extracting time-series, correlation, and trend features from the preprocessed data, screening key features using mutual information entropy and a random forest algorithm, removing redundant items, and establishing a dynamic update mechanism; constructing a freight rate prediction model based on a hybrid machine learning model, which integrates a long short-term memory network and a gradient boosting tree model, wherein LSTM captures the long-term temporal dependence of freight rates and XGBoost fits the nonlinear feature associations; constructing a multi-objective function based on the prediction results, subject to constraints such as transportation capacity and capital, and solving the function by a genetic algorithm to generate a differentiated initial strategy containing parameters such as buy and sell thresholds and position ratios; setting up a simulated trading environment, inputting historical and real-time data to test the initial strategy, using a value at risk model and stress testing to evaluate the risk resistance capability in multiple scenarios, quantifying the stability of returns and the effectiveness of risk control, and generating a report; constructing a reinforcement learning agent based on the verification report, using market state, strategy results, and risk indicators as inputs, and adaptively adjusting the strategy parameters with the goal of improving returns and reducing risks; and deploying the optimized strategy to the actual system, building a real-time monitoring module to capture market, strategy and risk data in real time, and triggering a secondary optimization process when the indicators deviate from expectations.

2. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that the process of collecting four types of data (freight supply and demand, macroeconomic, regional logistics support, and historical transaction data), filtering out outliers using an adaptive threshold algorithm, filling in missing values by time-series interpolation, and standardizing heterogeneous data to construct a basic database with a timestamp index specifically includes: collecting multi-source freight rate related data covering four dimensions: freight market supply and demand data, macroeconomic data, regional logistics support data, and historical transaction data. The data preprocessing process employs an adaptive threshold algorithm to filter out abnormal data, fills in missing values using a data completion model, standardizes heterogeneous data, constructs a freight rate database in a unified format, and establishes a data timestamp index.

3. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that, The process of extracting time-series, correlation, and trend features from preprocessed data, using mutual information entropy and random forest algorithms to filter key features, eliminating redundant items, and establishing a dynamic update mechanism specifically includes: Based on the preprocessed basic data, three core features are extracted: time-series features, correlation features, and trend features. Time-series features include daily fluctuation range of freight rates, weekly change cycle, monthly trend and quarterly seasonal fluctuation pattern. Correlation features include the correlation between supply and demand gap and freight rates, the impact coefficient of macroeconomic indicators on freight rates, and the linkage between regional logistics support capacity and freight rates. Trend features include the duration of freight rate increases / decreases, the precursor features of inflection points, and the characteristics of sudden changes in freight rates under extreme market conditions. The mutual information entropy combined with the random forest algorithm is used to screen key features, eliminate redundant and weakly correlated features, and construct a feature set. Establish a dynamic feature update mechanism to adjust feature dimensions and weights in real time based on market changes.

4. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that, The construction of the freight rate prediction model based on a hybrid machine learning model, which integrates a long short-term memory network and a gradient boosting tree model, uses LSTM to capture long-term time-series dependence of freight rates, and XGBoost to fit nonlinear feature associations, specifically includes: A freight rate prediction model based on a hybrid machine learning model is constructed by fusing a long short-term memory network and a gradient boosting tree to build the hybrid prediction model. The LSTM model is used to capture long-term dependencies in freight rate time series data and explore the inheritance pattern of freight rates in different time periods. The XGBoost model is used to capture the non-linear correlation between freight rates and various core features and strengthen the influence of key features on freight rate prediction results. An attention mechanism is introduced during model construction to assign higher weights to time period data and features that affect freight rate fluctuations, and a cross-validation method is used to divide the training set, validation set and test set.

5. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that, The process of constructing a multi-objective function based on the prediction results, using constraints such as transportation capacity and capital, and solving it through a genetic algorithm to generate a differentiated initial strategy containing parameters such as buy / sell thresholds and position ratios, specifically includes: Based on the freight rate forecast results of the hybrid model, a multi-objective optimization objective function is constructed in combination with the needs of freight transaction scenarios; The optimization objectives include maximizing freight revenue, minimizing transaction risk, and optimizing capital utilization. The constraints include capacity supply limitations, capital turnover cycle, market liquidity constraints, and policy compliance requirements. The objective function is solved by a genetic algorithm to generate an initial trading strategy, which includes the freight rate buy threshold, sell threshold, position ratio, trading frequency and risk hedging scheme. Differentiated strategy parameters are set for different types of cargo and different routes to adapt to diverse freight transaction scenarios.

6. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that the process of setting up a simulated trading environment, inputting historical and real-time data to test the initial strategy, using a value at risk model and stress testing to evaluate the risk resistance capability in multiple scenarios, quantifying the stability of returns and the effectiveness of risk control, and generating a report specifically includes: a simulated freight trading environment is set up, historical data and real-time incremental data are input into the simulation system, simulated trading is executed according to the initial trading strategy, and the profit data, risk data and strategy execution efficiency are recorded during the trading process. The risk assessment uses a VaR model combined with stress testing to evaluate the strategy's risk resilience under different scenarios such as normal market fluctuations, extreme market shocks, and policy adjustments. It calculates the Sharpe ratio, maximum drawdown rate, and win rate to quantify the strategy's return stability and risk control capability, and generates a simulation verification report.

7. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that the process of constructing a reinforcement learning agent based on the verification report, taking market state, strategy results, and risk indicators as inputs, and adaptively adjusting the strategy parameters with the goal of improving returns and reducing risks specifically includes: for the dynamic optimization of the strategy by reinforcement learning, a reinforcement learning agent is constructed on the basis of the simulation verification report. The state of the freight market, the results of strategy execution, and the risk indicators are used as the input state of the agent. The reward function rewards increased profit and reduced risk. The trading strategy parameters are continuously adjusted through trial-and-error learning. During the optimization process, a market adaptive mechanism is introduced. The agent captures changes in the market environment in real time, dynamically adjusts the freight rate buy and sell thresholds, the frequency of position adjustment, and the risk hedging ratio, corrects parameters in the initial strategy that do not match actual market demand, retains the core logic of the strategy, and realizes dynamic iterative optimization of the strategy.

8. The method for optimizing freight rate trading strategies based on machine learning according to claim 1, characterized in that, The process of deploying optimization strategies to the actual system, building a real-time monitoring module to synchronously capture market, strategy, and risk data, and triggering a secondary optimization process when indicators deviate from expectations specifically includes: The optimized freight rate trading strategy is deployed to the actual freight trading system. A real-time data acquisition and monitoring module is established to capture real-time market data, strategy execution data, and risk indicator data, and dynamically compare them with the optimized strategy parameters. When the market experiences abnormal fluctuations, the weights of core features change, or the strategy's execution effect deviates from the expected target, a secondary optimization process is triggered to perform a closed-loop iteration of the strategy.

9. A machine learning-based freight rate trading strategy optimization system, used to implement the machine learning-based freight rate trading strategy optimization method as described in any one of claims 1-8, characterized in that it comprises: a data acquisition module, responsible for acquiring multi-source data, filtering outliers through an adaptive threshold algorithm, completing missing data through time-series interpolation, standardizing heterogeneous data, and building a basic database with a timestamp index; a feature engineering module, which extracts time-series, correlation and trend features from the preprocessed data, uses mutual information entropy and a random forest algorithm to screen key features, and dynamically updates the feature set to adapt to market changes; a hybrid prediction model module, which integrates LSTM and XGBoost models, wherein LSTM captures the long-term time-series dependence of freight rates and XGBoost fits nonlinear feature correlations, and which introduces an attention mechanism to strengthen the weights of key features and improves generalization ability through cross-validation; a strategy generation module, which constructs a multi-objective function based on the prediction results and uses a genetic algorithm to generate differentiated initial strategies subject to constraints such as transportation capacity and capital; a simulated trading and risk assessment module, which builds a simulated environment to test strategies, combines a VaR model and stress testing to evaluate risk resistance capability, and quantifies return stability and risk control effectiveness; a reinforcement learning optimization module, which constructs a reinforcement learning agent based on the verification report, dynamically adjusts the strategy parameters, and achieves adaptive optimization through feedback from market state and risk indicators; a closed-loop iteration module, which deploys the optimized strategy to the actual system, monitors market data and strategy performance in real time, and triggers a secondary optimization process when indicators deviate from expectations; and a processor, used to carry out the calculation of each formula and the construction and calculation of each model.