A neural network-based API interface request adaptive scheduling method, device and system

By using a neural network-based adaptive scheduling method for API requests, and leveraging an LSTM model to predict the probability of API errors and dynamically adjust the request strategy, the problem of high API request failure rate and low resource utilization in high-concurrency scenarios is solved, thereby improving the real-time performance and stability of API requests.

CN122309141A · Pending · Publication Date: 2026-06-30 · CHINA LIFE INSURANCE CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: CHINA LIFE INSURANCE CO LTD
Filing Date: 2026-03-23
Publication Date: 2026-06-30

Figure CN122309141A_ABST

Abstract

This application provides a neural network-based adaptive scheduling method, apparatus, and system for API interface requests, aiming to solve the technical problem that existing fixed-delay retry strategies cannot dynamically adapt to changes in interface load, resulting in high request failure rates and poor real-time performance. The method includes: collecting request logs containing preset types of interface error information; constructing time-series features; training an interface error prediction model using a time-series prediction model to predict the probability of a specific error occurring within a future window; inputting real-time logs into the model to obtain predicted error probabilities; and dynamically adjusting the interval and / or concurrency of subsequent requests based on these predicted values. Through predictive scheduling and adaptive adjustment, this application significantly improves the request success rate and resource utilization of API interfaces in high-concurrency scenarios.

Description

Technical Field

[0001] This application relates to the field of computer technology, and in particular to an adaptive scheduling method, apparatus and system for API interface requests based on neural networks.

Background Technology

[0002] "One-stop" commercial insurance settlement is a new integrated settlement model combining "medical insurance + commercial insurance + out-of-pocket payment." It connects the data, business, and financial channels between medical institutions and insurance companies, solving the problem of isolated medical information and achieving interconnectivity. Commercial insurance patients no longer need to photocopy materials, pay treatment fees, or wait for reimbursement, significantly improving service timeliness and accessibility. To achieve the exchange and sharing of medical insurance and commercial insurance data, it is necessary to periodically obtain medical insurance data through a dedicated medical insurance exchange interface. However, in existing technologies, due to varying degrees of limitations in the medical insurance system regarding interface performance and concurrency, API requests frequently encounter errors. To ensure real-time data access, API requests continue to send data requests frequently, leading to a continuous increase in the load on the data interface and exacerbating the error situation. This cross-system data exchange scenario is essentially based on API (such as RESTful API, SOAP API) data transmission protocols, including mechanisms such as authentication, data format standardization, and request / response message structures. The interface performance limitations correspond to technical bottlenecks such as the request queue capacity (e.g., Queue Size), resource pool management (e.g., thread pool, connection pool), and rate limiting strategies (e.g., Token Bucket, Sliding Window). Request errors are specifically manifested as error codes returned by the server (e.g., 503 Service Unavailable, 429 Too Many Requests) or exception type markers in the system logs.

[0003] In existing technologies, the number of failed requests is typically dynamically assessed. When a preset threshold is reached, a delay is set before resending the interface request. This strategy falls under the fixed-backoff method within the client-side retry mechanism. While this approach alleviates performance degradation issues for medical insurance data interfaces to some extent, the fixed request delay cannot be dynamically adjusted based on interface limitations and load conditions. This leads to a situation where too short a delay increases the load on the data requesting end, preventing data acquisition, while too long a delay wastes excessive interface waiting time, reducing the real-time performance of data downloads. Here, "load conditions" refers to server monitoring metrics such as CPU utilization, memory usage, request response time, and error rate.

[0004] The aforementioned technical issues are not unique to the healthcare industry. In cross-system data exchange scenarios across various fields such as e-commerce, financial payments, logistics tracking, IoT device management, and cloud computing service invocation, server-side API interfaces also face performance bottlenecks and resource limitations under high-concurrency requests. Frequent client requests to ensure data timeliness also exacerbate interface load, creating a vicious cycle. Furthermore, existing simple and fixed delay retry strategies are unable to adapt to the dynamically changing interface load in these scenarios, leading to high request failure rates, resource waste, and reduced real-time performance. Therefore, the application scenario of this application should be understood as a broad range of API interface data exchange environments requiring high reliability and real-time performance.

[0005] In cross-system data exchange and sharing scenarios based on API interfaces, existing technologies generally face the following technical challenges when dealing with high-concurrency requests, interface performance limitations, and load balancing issues:

[0006] Server-side data interface performance bottlenecks: Different systems have different requirements for interface performance and concurrent processing capabilities, such as full request queues and insufficient resources. This leads to frequent errors from the interface in high-concurrency data request scenarios, failing to meet the real-time data interaction needs in a timely manner. Frequent client requests worsen interface load: To ensure data timeliness, data requesting clients typically send continuous requests to the interface. Excessively frequent requests further increase the interface load, creating a vicious cycle that causes prolonged interface response time or request failures.

[0007] Limitations of Fixed Delay Strategies on Clients: Existing technologies often use fixed delay strategies to handle API request errors. This involves setting a fixed delay time to resend the request after the number of failed requests reaches a preset threshold. However, fixed delays cannot dynamically adapt to API load conditions, potentially leading to two scenarios: Too short a delay: The API has not yet recovered, resulting in a persistently high request failure rate and exacerbating load issues. Too long a delay: API resources and waiting time are wasted, reducing the real-time performance of data interaction.

[0008] The aforementioned problems not only exist in data exchange scenarios within specific industries, but also widely appear in other applications that require real-time data interaction across systems and domains, such as e-commerce, logistics management, and financial transactions. How to effectively manage interface performance under high-concurrency requests, dynamically optimize request strategies, and improve the stability and real-time performance of data interaction has become a pressing technical challenge.

[0009] Therefore, in cross-system API interface data exchange scenarios, due to the performance bottlenecks and concurrency limitations of the server-side interface, if the client adopts a fixed delay retry strategy, it cannot adapt to changes in interface load, resulting in technical problems such as increased interface request failure rate, low resource utilization, and insufficient real-time data interaction.

Summary of the Invention

[0010] This application proposes an adaptive scheduling method, apparatus, and system for API interface requests based on neural networks, which solves the problems of existing technologies being unable to dynamically adjust request strategies according to interface load, resulting in low interface resource utilization, unstable request success rate, and insufficient real-time data interaction.

[0011] Firstly, this application provides an API request adaptive scheduling method based on a neural network. The method involves collecting API request log data over a historical time period. The request log data includes at least a request timestamp and preset types of interface error information representing the reasons for request failure. These preset types of interface error information include error codes or error type identifiers triggered by server resource limitations. Based on the request log data, a time-ordered feature sequence is constructed. This feature sequence contains features representing the interface load status. A time series prediction model is used to train the feature sequence to obtain an interface error prediction model. This model is used to predict the probability of the API interface experiencing the preset type of interface error in a future time window based on the feature sequence of historical time windows. Real-time request log data within the current time window is obtained and input into the interface error prediction model to obtain a predicted probability value for the next time window. Based on this predicted probability value, the number and / or timing of data requests sent to the API interface are dynamically adjusted.

[0012] In some embodiments, the dynamically adjusted scheduling strategy may specifically include a request interval duration and / or an upper limit for the number of concurrent requests; in response to an increase in the predicted value, increasing the request interval duration and / or decreasing the upper limit for the number of concurrent requests; and / or, in response to a decrease in the predicted value, decreasing the request interval duration and / or increasing the upper limit for the number of concurrent requests.

[0013] In some embodiments, the error code corresponding to the preset type of interface error information is returned by the API interface server when resources are insufficient or the request queue is saturated.

[0014] In some embodiments, the time series prediction model is a long short-term memory network model; the training process is optimized using a joint loss function, which includes a classification loss term for predicting the probability of interface error occurrence and a regression loss term for predicting the recommendation request interval.

[0015] In some embodiments, the method further includes an online model update step: continuously collecting the latest API interface request log data; and updating and training the interface error prediction model using the latest collected log data at a preset period.

[0016] In some embodiments, before training the feature sequence using the time series prediction model, the method further includes: optimizing the initial weights of the time series prediction model using a metaheuristic optimization algorithm. Further, the metaheuristic optimization algorithm may be the Red Deer algorithm, which searches for the optimal weights of the time series prediction model in the search space by simulating the natural competitive behavior of a red deer.

[0017] In some embodiments, the time series prediction model is a multi-model ensemble system, including at least two different types of time series prediction models; predicting the probability of the API interface experiencing the preset type of interface error within a future time window includes: each time series prediction model outputting a predicted probability; and weighting and fusing the predicted probabilities of each model based on their recent prediction performance metrics to obtain the final predicted value of the interface error occurrence probability. Further, the at least two different time series prediction models include at least two of the following: a long short-term memory network model, a gated recurrent unit model, and a temporal convolutional network.

[0018] In some embodiments, the method further includes a real-time circuit breaker step: monitoring the real-time failure rate of sending data requests to the API interface within the current time window; if the real-time failure rate exceeds a preset threshold, ignoring the output of the interface error prediction model, and directly enabling a preset protective scheduling strategy, wherein the request frequency of the protective scheduling strategy is lower than the request frequency under the normal adjustment strategy.

[0019] In some embodiments, before training the feature sequence using the time series prediction model, the method further includes: clustering multiple API interfaces based on historical call characteristics and error patterns of each API interface to obtain at least two interface categories; the interface error prediction model includes sub-models corresponding to each interface category; obtaining real-time request log data within the current time window and inputting it into the interface error prediction model includes: identifying the interface category to which the target API interface to be requested belongs, and inputting the real-time request log data into the corresponding sub-model. Further, the historical call characteristics and error patterns may include: average call frequency, response time distribution characteristics, error type distribution, and the temporal regularity of error occurrence.

[0020] This application embodiment also provides an API interface request adaptive scheduling device based on a neural network, used to implement the method described in any of the preceding claims, comprising: a data acquisition module, used to acquire API interface request log data within a historical time period, the request log data including at least a request timestamp and interface error information of a preset type representing the reason for request failure, the preset type of interface error information including an error code or error type identifier triggered by server resource limitations; a sequence construction module, used to construct a time-ordered feature sequence based on the request log data, the feature sequence containing features representing the interface load state; a model training module, used to train the feature sequence using a time series prediction model to obtain an interface error prediction model, the interface error prediction model being used to predict the probability of the API interface experiencing the preset type of interface error in a future time window based on the feature sequence of historical time windows; a real-time prediction module, used to acquire real-time request log data within the current time window, input it into the interface error prediction model, and obtain a predicted value for the probability of interface error occurrence for the next time window; and a strategy scheduling module, used to dynamically adjust the number and / or time of sending data requests to the API interface based on the predicted value of interface error occurrence.

[0021] In some embodiments, the apparatus further includes a model update module, configured to continuously collect the latest API interface request log data and update and train the interface error prediction model using the latest collected log data at a preset period.

[0022] In some embodiments, the device further includes a circuit breaker control module, which monitors the real-time failure rate of sending data requests to the API interface within the current time window, and when the real-time failure rate exceeds a preset threshold, ignores the output of the interface error prediction model and directly enables a preset protective scheduling strategy.

[0023] This application also provides an API interface request adaptive scheduling system, comprising: a request-end device or an independent scheduling server, deployed with the neural network-based API interface request adaptive scheduling device as described in any of the foregoing embodiments, or configured to execute the neural network-based API interface request adaptive scheduling method as described in any of the foregoing embodiments; at least one client application, deployed on the request-end device, for initiating API interface call requests; at least one API interface server, for responding to the call requests; wherein, the API interface request adaptive scheduling device is located on the request path between the client application and the API interface server, for intercepting and intelligently scheduling requests from the client application.

[0024] This application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method described in any of the preceding embodiments.

[0025] This application also provides an electronic device, including: a memory storing a computer program; and a processor configured to execute the computer program stored in the memory to implement the method described in any of the preceding embodiments.

[0026] The at least one technical solution adopted in this application can achieve the following beneficial effects: through time series analysis based on LSTM models, it is possible to dynamically learn and predict the performance limiting behavior of interfaces (such as concurrency bottlenecks and request intervals), and intelligently adjust the time interval and concurrency strategy of API requests. Compared with fixed delay strategies, it is more flexible and can adapt to the interface status in real time, thereby significantly improving the success rate of interface requests and resource utilization. Its core concept lies in building an intelligent API request scheduling system based on time series prediction. Through machine learning models (such as LSTM), it learns and predicts the performance limiting behavior of interfaces, dynamically adjusts client request strategies (such as request intervals and concurrency), so as to achieve predictive scheduling, adaptive adjustment and continuous optimization, integrating three major technical elements: time series analysis, machine learning prediction, and dynamic parameter adjustment.

Attached Figure Description

[0027] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings:

[0028] Figure 1 is a flowchart of a core embodiment of the method described in this application;

[0029] Figure 2 is a flowchart of an embodiment of the method in this application that includes online model updates;

[0030] Figure 3 is a flowchart of an embodiment of the method in this application that includes multi-model integration;

[0031] Figure 4 is a flowchart of an embodiment of the method in this application that includes a real-time circuit breaker step;

[0032] Figure 5 is a flowchart of an embodiment of the method in this application that includes fine-grained classification and adaptive scheduling of API interfaces;

[0033] Figure 6 is a structural diagram of an embodiment of the device described in this application;

[0034] Figure 7 is a structural diagram of an embodiment of the system described in this application.

Detailed Implementation

[0035] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0036] The technical solutions provided by the various embodiments of this application are described in detail below with reference to the accompanying drawings.

[0037] Embodiment 1: Core Method Flow

[0038] Referring to Figure 1, which is a flowchart of a core embodiment of the method described in this application, the method includes the following steps.

[0039] Step 110: Collect API interface request log data within a historical time period. The request log data includes at least the request timestamp and interface error information of a preset type that indicates the reason for request failure.

[0040] The preset type of interface error information includes error codes or error type identifiers triggered by server resource limitations.

[0041] Specifically, this step collects data from API request logs (which may be stored in a database or in text files). Each request log entry includes the error type (such as Query queue full or Insufficient resources), the request timestamp (accurate to the second or minute), the request interval (the time elapsed since the previous request, usually equal to the period of a scheduled task), the interface response time (ms), the response data size (typically 1000 records, but possibly 0 records if the interface fails), and the request parameters (such as the request body of a POST request, which records the interface type and the specific request parameters). The preset types of interface error here are error codes returned by the server (such as 503 Service Unavailable or 429 Too Many Requests) or exception type markers in the system logs (such as Query queue full or Insufficient resources). These errors are returned by the API interface server when resources are insufficient or the request queue is saturated.

[0042] Step 120: Based on the request log data, construct a time-ordered feature sequence, which includes features characterizing the interface load status.

[0043] This step preprocesses the request log data. Duplicate or invalid data, such as incomplete log entries or entries with missing fields, is removed. The time format is standardized, request parameters are normalized (e.g., different method types are encoded as categorical values), errors that are not caused by server resource limits (e.g., 4xx errors caused by invalid user requests) are removed, and error types are encoded numerically (e.g., Query queue full maps to 1, Insufficient resources maps to 2, and no error maps to 0). The log data can be divided in chronological order into a training set (80%), a validation set (10%), and a test set (10%), ensuring that the training set contains sufficient error samples. The result is multi-dimensional time-series feature data that includes the error type, request interval, concurrency, response time, and other features. Within this feature sequence, the request interval, concurrency, and response time all characterize the load status of the interface.
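The preprocessing described in this step can be illustrated with a minimal sketch. The `LogEntry` record layout, the function names, and the rule of treating any error type outside the encoding table as a non-limit error are assumptions made for illustration; only the error-type encoding (0, 1, 2) and the 80/10/10 chronological split come from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Encoding from the text: 0 = no error, 1 = Query queue full, 2 = Insufficient resources.
ERROR_CODES = {None: 0, "Query queue full": 1, "Insufficient resources": 2}


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical layout of one request log record."""
    timestamp: datetime
    error_type: Optional[str]   # None means the request succeeded
    interval_s: float           # seconds since the previous request
    response_ms: float          # interface response time


def preprocess(entries):
    """Sort by time, drop duplicates and non-limit errors, encode error types."""
    seen, rows = set(), []
    for entry in sorted(entries, key=lambda e: e.timestamp):
        if entry.error_type not in ERROR_CODES:
            continue            # assumed to be an error unrelated to resource limits
        if entry in seen:
            continue            # exact duplicate record
        seen.add(entry)
        rows.append((entry.timestamp, ERROR_CODES[entry.error_type],
                     entry.interval_s, entry.response_ms))
    return rows


def chronological_split(rows, train=0.8, val=0.1):
    """Split time-ordered rows into train, validation, and test sets without shuffling."""
    n = len(rows)
    i, j = int(n * train), int(n * (train + val))
    return rows[:i], rows[i:j], rows[j:]
```

Splitting by position rather than at random keeps every validation and test sample later in time than the training samples, which matches the forecasting setting in which the model is later used.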

[0044] Step 130: Train the feature sequence using a time series prediction model to obtain an interface error prediction model.

[0045] The interface error prediction model is used to predict the probability of the API interface experiencing the preset type of interface error within a future time window based on the feature sequence of historical time windows.

[0046] This application employs an LSTM (Long Short-Term Memory) network as the time series prediction model. The model structure includes an input layer, an LSTM layer, a fully connected layer, and an output layer. The input layer receives multi-dimensional time series input; the LSTM layer captures time series features and the temporal patterns of error occurrence; the fully connected layer maps the LSTM output to predicted values; and the output layer provides error type prediction (for classification tasks, outputting the probability of occurrence) and / or request interval prediction (for regression tasks, outputting suggested time intervals). The training objective is a joint loss function, including a loss for the classification task and a loss for the regression task.

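[0047] Specifically, the classification objective is to minimize the cross-entropy loss of the error type prediction:

[0048] $$L_{cls} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c}\,\log \hat{y}_{i,c};$$

[0049] The regression objective is to minimize the mean squared error (MSE) of the predicted request interval:

[0050] $$L_{reg} = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - \hat{t}_i\right)^2;$$

[0051] The total loss function is $L = L_{cls} + \lambda L_{reg}$, where $\lambda$ is a weighting coefficient used to balance the two tasks. Here $N$ is the number of samples, $C$ is the number of error types, $y_{i,c}$ and $\hat{y}_{i,c}$ are the true label and the predicted probability of error type $c$ for sample $i$, and $t_i$ and $\hat{t}_i$ are the true and predicted request intervals.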
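A joint loss of this kind can be sketched in plain Python. The sketch assumes the total loss is the classification loss plus a weighting coefficient times the regression loss; the names `joint_loss` and `lam` and the default value 0.5 are illustrative and are not given in the source.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy for one sample: minus the log of the true class probability."""
    return -math.log(max(probs[label], 1e-12))  # clamp to avoid log(0)


def joint_loss(class_probs, labels, pred_intervals, true_intervals, lam=0.5):
    """Return (total, classification, regression) loss averaged over the batch.

    class_probs:    per-sample lists of predicted error-type probabilities
    labels:         per-sample index of the true error type
    pred_intervals: predicted request intervals
    true_intervals: reference request intervals
    lam:            assumed weighting coefficient balancing the two tasks
    """
    n = len(labels)
    l_cls = sum(cross_entropy(p, y) for p, y in zip(class_probs, labels)) / n
    l_reg = sum((a - b) ** 2 for a, b in zip(pred_intervals, true_intervals)) / n
    return l_cls + lam * l_reg, l_cls, l_reg
```

In a real training loop the same combination would be computed on tensors by a deep learning framework so that gradients can flow through both terms; the scalar version here only shows how the two objectives are balanced.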

[0052] The training process uses the Adam (Adaptive Moment Estimation) optimizer to update model parameters and continuously adjusts hyperparameters (such as time window size, number of LSTM layers, and number of neurons) to optimize model performance. The output of the interface error prediction model is either error type prediction (for classification tasks, outputting the probability of occurrence) or request interval prediction (for regression tasks, outputting suggested time intervals).

[0053] Step 140: Obtain the real-time request log data within the current time window, input it into the interface error prediction model, and obtain the predicted probability value of interface error occurrence for the next time window.

[0054] Specifically, the duration of the time window (e.g., 1 hour, 2 hours, or 3 hours) can be set flexibly according to the required prediction accuracy. If the model directly predicts a request interval for the time window, that predicted interval can be used as the request interval within the window. If the model instead predicts an interface limit, the request interval is automatically increased when a limit such as Query queue full is predicted, and is appropriately reduced once the limit is no longer predicted, so as to improve data interaction efficiency.

[0055] Step 150: Based on the predicted probability of the interface error, dynamically adjust the number and / or timing of sending data requests to the API interface.

[0056] This dynamic adjustment strategy is the core of this application. It dynamically adjusts request parameters based on prediction results to avoid invalid requests and improve success rate and efficiency. For example, when the predicted value increases (i.e., the probability of an interface limitation is high), the request interval is increased and / or the upper limit of concurrent requests is decreased to reduce request frequency and proactively avoid interface overload. When the predicted value decreases (i.e., the probability of an interface limitation is low or performance is good), the request interval is decreased and / or the upper limit of concurrent requests is increased to increase request frequency and improve the real-time performance of data interaction. This contrasts sharply with the fixed-delay strategy in existing technologies, enabling adaptive adjustments based on the load conditions represented by server monitoring metrics (such as error rate and response time). This method solves the core technical problem of low request efficiency and poor real-time performance caused by the inability of fixed-delay strategies to adapt to dynamic interface load, forming a closed-loop process of "data collection and characterization → model training and prediction → dynamic strategy scheduling → feedback and iterative optimization."
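The adjustment rule described above can be sketched as a simple mapping from the predicted error probability to a request interval and a concurrency cap. The linear interpolation, the bounds, and the names `Schedule` and `adjust` are assumptions chosen for illustration; the source only fixes the direction of the adjustment.

```python
from dataclasses import dataclass


@dataclass
class Schedule:
    interval_s: float       # delay between consecutive requests
    max_concurrency: int    # upper limit on concurrent requests


def adjust(p_error, min_interval=1.0, max_interval=60.0, min_conc=1, max_conc=16):
    """Map a predicted error probability in [0, 1] to a scheduling strategy.

    A higher probability gives a longer interval and a lower concurrency cap;
    a lower probability gives a shorter interval and a higher cap.
    """
    p = min(max(p_error, 0.0), 1.0)  # clamp out-of-range predictions
    interval = min_interval + p * (max_interval - min_interval)
    conc = max(min_conc, round(max_conc - p * (max_conc - min_conc)))
    return Schedule(interval, conc)
```

For example, `adjust(0.2)` yields a shorter interval and a higher concurrency cap than `adjust(0.8)`. A deployment could replace the linear mapping with stepwise thresholds or with the interval suggested by the regression output of the model.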

[0057] Embodiment 2: Process including online model updates

[0058] Figure 2 is a flowchart of an embodiment of the method in this application that includes online model updates. Building upon Embodiment 1, to ensure that the model can adapt promptly to changes in API interface performance (e.g., server expansion or code optimization that improves API response performance), this embodiment adds an online model update step, which provides a continuous learning mechanism.

[0059] Step 210: Train the initial interface error prediction model (as in Steps 110-130 of Embodiment 1).

[0060] Step 220: Continuously collect the latest API interface request log data.

[0061] This step forms the data foundation for online model updates. Since API performance, load patterns, and business access patterns may change over time (e.g., server upgrades, shifts during peak periods), continuously collecting the latest request log data captures these changes, ensuring that the data used to update the model reflects the current system state and behavior patterns.

[0062] Step 230: Update and train the interface error prediction model using the latest collected log data at a preset period (e.g., one week or 15 days).

[0063] This step performs the online update of the model. By periodically retraining or fine-tuning the existing model on new data, its parameters are adapted to the new data distribution, which maintains or even improves the accuracy of predicting future interface errors. Regular online updates of the LSTM model therefore allow it to follow dynamic changes in API interface performance in a timely manner, so that the request interval settings derived from it converge more quickly to values that suit the current state of the interface.
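The periodic update can be sketched as a small helper that buffers recent log rows and triggers retraining once the preset period has elapsed. The class name `PeriodicUpdater`, the `retrain` callback, and the bounded buffer size are hypothetical; the source specifies only that updates happen at a preset period such as one week or 15 days.

```python
from collections import deque
from datetime import datetime, timedelta


class PeriodicUpdater:
    """Collect recent log rows and retrain the model at a preset period."""

    def __init__(self, retrain, period=timedelta(days=7), max_rows=10_000, start=None):
        self.retrain = retrain                   # hypothetical callback taking a list of rows
        self.period = period                     # preset update period
        self.buffer = deque(maxlen=max_rows)     # keeps only the most recent rows
        self.last_update = start or datetime.now()

    def observe(self, row, now):
        """Store a new log row; return True if this call triggered retraining."""
        self.buffer.append(row)
        if now - self.last_update >= self.period:
            self.retrain(list(self.buffer))
            self.last_update = now
            return True
        return False
```

Passing the current time explicitly to `observe` keeps the helper deterministic and easy to test. Whether the callback retrains from scratch or fine-tunes the existing model is left to the caller.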

[0064] Embodiment 3: Process including weight optimization

[0065] Based on Embodiment 1 or 2, in order to further optimize the training process and performance of the model, a metaheuristic optimization algorithm can be introduced to optimize the initial weights of the LSTM model before the model training described in step 130.

[0066] Specifically, the Red Deer Algorithm (RDA) can be used to optimize the initial weights of the time series prediction model. By simulating the natural competitive behavior of red deer to find the optimal solution, this helps the LSTM model find a better weight configuration during training, thereby improving the model's prediction accuracy for time series data. The LSTM model optimized by RDA can better capture complex patterns and trends in the data, enhancing its generalization ability on unknown data. RDA's randomness and global search capability help avoid the LSTM model getting trapped in local optima, reducing the risk of overfitting. As a metaheuristic algorithm, RDA can find a satisfactory solution in a relatively short time, which can accelerate the training process of the LSTM model, especially when dealing with large-scale datasets.

[0067] Embodiment 4: A process including multi-model integration

[0068] Figure 3 is a flowchart of an embodiment of the method in this application that includes multi-model integration. This embodiment aims to improve prediction accuracy and system robustness by avoiding the bias or failure of a single model (such as LSTM) under specific data patterns. It is a specific optimized implementation of the "training using a time series prediction model" described in step 130 of Embodiment 1, and represents an enhancement to the core prediction process.

[0069] Step 310: Construct a multi-model ensemble system as a time series prediction model. This system includes at least two different types of time series prediction models, such as a Long Short-Term Memory (LSTM) network model, a Gated Recurrent Unit (GRU) model, a Temporal Convolutional Network (TCN), or Prophet. These models are trained in parallel on the same set of historical feature sequences.

[0070] This step replaces or refines the model building part of step 130 in Example 1, expanding the single LSTM model into a heterogeneous model set. It aims to leverage the advantages of different model architectures to learn time series patterns from different perspectives, providing a diverse foundation for subsequent fusion prediction.

[0071] Step 320: When making predictions, each time series prediction model outputs its own prediction probability of a preset interface error occurring in the next time window, based on the real-time request log data within the current time window.

[0072] This step is a parallel execution of the prediction process in step 140, with each sub-model completing a prediction independently.

[0073] Step 330: Weight and fuse the predicted probabilities of each model to obtain the final, more reliable predicted probability of interface error occurrence.

[0074] Preferably, dynamic weights are calculated based on the recent prediction accuracy, stability, and other performance metrics of each model. Then, the prediction probabilities of each model are weighted and fused to obtain a final, more reliable prediction value for the probability of interface errors. This approach leverages model diversity to balance prediction accuracy and stability, reducing the risk of dependence on a single model. This step is a result fusion layer added to step 140. It does not change the core "input data-model-output prediction" process, but after obtaining multiple preliminary predictions, a decision layer (weighted fusion) integrates the opinions of each model to generate a more robust and accurate final prediction value to support the scheduling decision in subsequent step 150.
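The weighted fusion of steps 320-330 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the class name `FusionEnsemble`, the model names, the history length, and the choice of inverse mean squared error as the "recent prediction accuracy" metric are all hypothetical.

```python
from collections import deque


class FusionEnsemble:
    """Fuse per-model error probabilities with weights derived from recent accuracy."""

    def __init__(self, model_names, history=50, eps=1e-6):
        # Recent squared errors per model (assumed metric; other metrics could be used).
        self.errors = {name: deque(maxlen=history) for name in model_names}
        self.eps = eps

    def record(self, name, predicted_prob, error_occurred):
        """Store how far a model's earlier prediction was from the observed outcome."""
        outcome = 1.0 if error_occurred else 0.0
        self.errors[name].append((predicted_prob - outcome) ** 2)

    def weights(self):
        """Inverse mean squared error per model, normalised to sum to 1."""
        raw = {}
        for name, errs in self.errors.items():
            # With no history, assume the error of a constant 0.5 guess (0.25).
            mse = sum(errs) / len(errs) if errs else 0.25
            raw[name] = 1.0 / (mse + self.eps)
        total = sum(raw.values())
        return {name: w / total for name, w in raw.items()}

    def fuse(self, probs):
        """Weighted average of the models' current predicted probabilities."""
        w = self.weights()
        return sum(w[name] * p for name, p in probs.items())
```

For example, after `record` has been called for each model over several windows, `fuse({"lstm": 0.8, "gru": 0.2})` returns a value closer to the prediction of whichever model has been more accurate recently.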

[0075] Example 5: A process including real-time circuit breaking

[0076] Figure 4 is a flowchart of an embodiment of the method in this application that includes a real-time circuit breaker step. This embodiment provides a second layer of protection for the system, preventing a cascading failure effect when the predictive model fails to detect sudden and severe interface failures in a timely manner. This embodiment is a security monitoring and emergency handling process independent of the core predictive-scheduling main process of Embodiment 1. It runs in parallel with the main process and takes over control in extreme cases where the main process may fail, thus constituting a system robustness enhancement scheme.

[0077] Step 410: Monitor the real-time failure rate of sending data requests to the API interface within the current time window.

[0078] Monitoring metrics may also include key indicators such as the P99 percentile of response time.

[0079] This step is independent of any step in Implementation Example 1 and is a continuously running monitoring thread focused on observing the current, real-time results of request execution.

[0080] Step 420: Determine whether the real-time failure rate exceeds a preset threshold (e.g., the error rate exceeds 50% in the past 10 seconds).

[0081] This threshold setting constitutes an independent, fast circuit breaker rule based on sliding window statistics. This step is a decision point based on simple statistics, which contrasts with and complements the prediction decision based on a complex model in Example 1 (steps 140-150).

[0082] Step 430: If the threshold is exceeded, ignore the output of the interface error prediction model and directly enable the preset protective scheduling strategy.

[0083] This protective strategy is typically set to significantly extend the request interval to several minutes, reducing the concurrency to 1. The request frequency is significantly lower than under normal adjustment strategies, and the system enters emergency protection mode. Simultaneously, a probe task is initiated at an extremely low frequency, tentatively sending requests until several consecutive probes are successful. Then, the system gradually recovers and returns control to the predictive model. When triggered, this step completely overrides or bypasses the dynamic adjustment strategy generated in step 150, forcing the adoption of an extremely conservative but safe scheduling strategy aimed at quickly stabilizing the system. It is a safe bypass of the main process in Implementation Example 1. This provides the system with rapid response capabilities to sudden traffic spikes or service crashes, enhancing overall service availability.
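Steps 410-430 can be sketched as a small sliding-window circuit breaker. The sketch below is illustrative only: the class name `CircuitBreaker`, the 10-second window, the 50% threshold, the minimum sample count, the three consecutive successful probes, and the 120-second protective interval are assumed values, and recovery here is immediate rather than gradual.

```python
from collections import deque


class CircuitBreaker:
    """Sliding-window failure-rate breaker with probe-based recovery."""

    def __init__(self, window_s=10.0, threshold=0.5, probes_needed=3, min_samples=5):
        self.window_s = window_s
        self.threshold = threshold
        self.probes_needed = probes_needed
        self.min_samples = min_samples
        self.events = deque()        # (timestamp, succeeded) pairs inside the window
        self.open = False            # True means protective mode is active
        self.probe_successes = 0

    def record(self, now, succeeded):
        if self.open:
            # In protective mode every request is treated as a probe.
            self.probe_successes = self.probe_successes + 1 if succeeded else 0
            if self.probe_successes >= self.probes_needed:
                self.open = False
                self.events.clear()
            return
        self.events.append((now, succeeded))
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()
        failures = sum(1 for _, ok in self.events if not ok)
        if len(self.events) >= self.min_samples and failures / len(self.events) > self.threshold:
            self.open = True
            self.probe_successes = 0

    def strategy(self, predicted):
        """Return (interval_s, concurrency); the model's suggestion is ignored while open."""
        return (120.0, 1) if self.open else predicted
```

A scheduler would call `record` after every response and pass the model-derived strategy through `strategy` before sending the next request, so that the breaker overrides the model only while it is open.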

[0084] Example 6: A process including fine-grained classification of interfaces

[0085] Figure 5 is a flowchart of an embodiment of this application that includes fine-grained API interface classification and adaptive scheduling. Building upon Embodiment 1, to address the significant performance differences among different API interfaces and the poor performance of a single unified model, and to achieve more refined resource scheduling, this embodiment adds an interface clustering step before model training. This embodiment is an extension and refinement of steps 130 (model training) and 140 (model prediction) in Embodiment 1. It adds interface classification and model selection steps to the core process, aiming to provide customized prediction models for different types of interfaces, thereby improving overall prediction and scheduling accuracy.

[0086] Step 610, Interface Profiling and Clustering: Based on the historical call characteristics and error patterns of each API interface, cluster multiple API interfaces to obtain at least two interface categories.

[0087] Historical call characteristics and error patterns may include: average call frequency, response time distribution characteristics, error type distribution, and the temporal regularity of error occurrence. For example, using clustering algorithms such as K-means, feature analysis can be performed on all API interfaces to classify them into several categories (such as "high-frequency sensitive", "low-frequency heavy", "stable"). This step is a preprocessing extension of Example 1, classifying the data sources (different API interfaces) before constructing the general feature sequence.
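The clustering in step 610 can be sketched with a plain K-means over per-interface profile vectors. The interface paths, feature values, and the choice of seeding the centroids with the first `k` points are assumptions made to keep the example small and deterministic; a production system would more likely use a library implementation with better initialisation.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical profiles: (call frequency, mean response time, error rate),
# each already scaled to roughly [0, 1] so that no feature dominates the distance.
profiles = {
    "/claims/query":  (0.95, 0.10, 0.30),
    "/report/export": (0.05, 0.90, 0.10),
    "/policy/search": (0.90, 0.15, 0.25),
    "/archive/batch": (0.10, 0.85, 0.15),
}
names = list(profiles)
labels = kmeans([profiles[n] for n in names], k=2)
category = dict(zip(names, labels))
```

Here the two frequently called, fast interfaces end up in one category and the two rarely called, slow interfaces in the other; the category labels themselves are arbitrary integers.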

[0088] Step 620, Hierarchical Modeling: For each interface category obtained by clustering, construct or fine-tune a time series prediction sub-model. The final interface error prediction model is a collection of multiple sub-models corresponding to the interface category.

[0089] Specifically, a baseline LSTM model can be trained based on all interface logs as a general model, and then a separate LSTM model (or a fine-tuned general model) can be trained for each interface category as a specialized model to learn the specific behavioral patterns of that type of interface. This step is a refinement of step 130, changing the training of "one" model into training "a set" of targeted models.

[0090] Step 630, Layered Scheduling: When prediction is required, first identify the interface category to which the target API interface to be requested belongs, and then input the real-time request log data into the corresponding sub-model for prediction.

[0091] For newly added interfaces or those with very few calls, the prediction results of the general model are used by default. Interface profiles can also be re-evaluated periodically; when an interface's behavior pattern changes, the interface is reassigned to a new category, triggering an update of the corresponding model. This step extends step 140 by inserting a step of "selecting the corresponding sub-model based on the interface category" between "acquiring the data" and "inputting it into the model." This hierarchical scheduling approach is tailored to each interface type, providing more precise scheduling strategies for interfaces with different performance characteristics and further improving overall efficiency and resource utilization.
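The sub-model selection of step 630, including the fallback to the general model, can be sketched as below. The class name `LayeredPredictor`, the `min_calls` cut-off, and the representation of models as plain callables are assumptions for illustration.

```python
class LayeredPredictor:
    """Route a prediction to the sub-model of the target interface's category."""

    def __init__(self, general_model, sub_models, categories, min_calls=100):
        self.general_model = general_model   # callable: features -> probability
        self.sub_models = sub_models         # category -> callable
        self.categories = categories         # interface name -> category
        self.min_calls = min_calls           # below this, history is considered too thin

    def predict(self, interface, features, call_count):
        category = self.categories.get(interface)
        model = self.sub_models.get(category)
        if model is None or call_count < self.min_calls:
            # New or rarely called interface: fall back to the general model.
            model = self.general_model
        return model(features)
```

Re-clustering then only needs to replace the `categories` mapping; the dispatch logic itself is unchanged.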

[0092] It should be noted that the execution entities for the various steps provided in the above embodiments can be the same device or different devices. For example, data acquisition and preprocessing can be performed by a log server, model training and prediction can be performed by a scheduling server, and the final request can be sent by the client.

[0093] Example 7: Device Example

[0094] Figure 6 is a structural diagram of an embodiment of the device described in this application. A neural network-based API interface request adaptive scheduling device 500, used to implement any of the aforementioned method embodiments, includes:

[0095] The data acquisition module 51 is used to perform the function of collecting historical API interface request log data as described in step 110.

[0096] Sequence construction module 52 is used to perform the function of constructing a time-ordered feature sequence as described in step 120.

[0097] The model training module 53 is used to perform the function of training the interface error prediction model as described in step 130.

[0098] The real-time prediction module 54 is used to perform the function of acquiring real-time data and obtaining predicted values as described in step 140.

[0099] The strategy scheduling module 55 is used to perform the function of dynamically adjusting the request strategy as described in step 150.

[0100] In some embodiments, the device 500 may further include a model update module (not shown in Figure 5), used to perform the online model update function as described in steps 220-230 of the embodiment.

[0101] In some embodiments, the device 500 may further include a circuit breaker control module (not shown in Figure 5), used to perform the real-time circuit breaking function as described in steps 410-430 of the embodiment.

[0102] Example 8: System Example

[0103] Figure 7 is a structural diagram of an embodiment of the system described in this application. An API interface request adaptive scheduling system 700 includes: an API interface request adaptive scheduling device 500 as described in Embodiment 7, deployed on a requesting device 71 or a separate scheduling server 72; at least one client application, deployed on the requesting device, for initiating API interface call requests; and at least one API interface server 73 for responding to the call requests. The API interface request adaptive scheduling device 500 is located on the request path between the client application and the API interface server 73, and is used to intercept requests from the client application, schedule them according to the aforementioned method, and then send them to the API interface server 73. This system can be applied to products such as medical data front-end exchange platforms.

[0104] Database / Log Server 74: Used to store historical API request log data, providing the data source needed for training and updating predictive models in the scheduling unit. It is typically connected directly to the scheduling server.

[0105] The requesting device has a client application deployed on it (the requesting device usually refers to a client or terminal, but depending on the specific deployment mode, it may also be an intermediate server). The API interface request adaptive scheduling device 500 is set on the request path between the client application and the API interface server 73, and has three system application modes:

[0106] Method 1: The requesting device accesses the API interface server via the network, and the adaptive scheduling device is set on the client device;

[0107] Method 2: The requesting device accesses the API interface server via the network, and the adaptive scheduling device is set on the API interface server;

[0108] Method 3: The requesting device accesses a dedicated scheduling server via the network, and the adaptive scheduling device is located on the scheduling server. The scheduling server acts as a middleware proxy between the requesting device and the API interface server. The requesting device can initiate API call requests to the scheduling server (or a local proxy) via the network. The scheduling server (optional) is deployed as an independent node running the API interface request adaptive scheduling device (i.e., the core algorithm module of this application). It receives requests from the requesting device and executes the prediction and scheduling logic. The API interface server is the backend server that provides the target API service. It receives requests sent or forwarded by the scheduling device and returns responses.

[0109] It should be noted that this application uses LSTM as the core time series analysis model, but similar functions can also be achieved in other ways. For example, a sliding window can be used to count the error rate and response time in historical request logs, with threshold rules set to dynamically adjust the request strategy. This approach is simple to implement and suitable for scenarios with small fluctuations in error rate and load, but it lacks the ability to predict dynamic changes in interface performance and has limited effectiveness in high-concurrency and complex scenarios. Other models and mechanisms can also be introduced, such as: (1) ARIMA (Autoregressive Integrated Moving Average model), used for time series analysis to predict the time pattern of interface error occurrence, but it is not capable of handling nonlinear and high-dimensional features. (2) Decision Tree / Random Forest, which trains a classification model on request features (such as time, concurrency, response time, etc.) to predict restricted behavior, but it is poor at capturing long-term dependencies in time series. (3) Dynamic load balancing based on distributed request queues, which introduces distributed request queue middleware, optimizes request traffic by dynamically adjusting queue depth and processing rate, and can achieve basic load balancing without a prediction model, but it cannot perform fine-grained optimization for specific interface types or time patterns.

[0110] Other embodiments: computer-readable storage media and electronic devices

[0111] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0112] Therefore, this application also proposes a computer-readable storage medium having a computer program stored thereon that, when executed by a processor, implements the methods described in any embodiment of this application.

[0113] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0114] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0115] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0116] Furthermore, this application also proposes an electronic device (or computing device) including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method described in any embodiment of this application. In a typical configuration, the computing device includes one or more processors (CPUs), input / output interfaces, a network interface, and memory. Memory may include non-persistent memory in computer-readable media, random access memory (RAM), and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media. Computer-readable media includes both permanent and non-persistent, removable and non-removable media, and information storage can be implemented by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data.

[0117] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0118] Those skilled in the art will understand that, unless specifically stated otherwise, the singular forms “a,” “an,” and “the” used herein may also include the plural forms. It should be further understood that the term “comprising” as used in this application’s specification means the presence of the stated features, integers, steps, operations, elements, and / or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and / or groups thereof. It should be understood that when an element is “connected” to another element, it may be directly connected to the other element, or there may be an intermediate element. Furthermore, the term “connected” as used herein may include wireless connections. The term “and / or” as used herein includes any one of, and all combinations of, one or more of the associated listed items.

[0119] Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains.

[0120] The above description is merely an embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.

Claims

1. An adaptive scheduling method for API interface requests based on neural networks, characterized in that it comprises: collecting API interface request log data within a historical time period, the request log data including request timestamps and interface error information of a preset type that represents the reason for request failure; constructing, based on the request log data, a time-ordered feature sequence containing features characterizing the interface load status; training on the feature sequence using a time series prediction model to obtain an interface error prediction model, the interface error prediction model being used to predict, based on the feature sequence of a historical time window, the probability of the API interface experiencing the preset type of interface error in a future time window; obtaining real-time request log data within the current time window, inputting it into the interface error prediction model, and obtaining the predicted probability value of interface error occurrence for the next time window; and dynamically adjusting, based on the predicted probability of interface errors, the number and / or timing of data requests sent to the API interface.

2. The method according to claim 1, characterized in that the error code corresponding to the preset type of interface error information is returned by the API interface server when resources are insufficient or the request queue is saturated.

3. The method according to claim 1, characterized in that the time series prediction model is a long short-term memory network model, and the training process is optimized using a joint loss function, which includes a classification loss term for predicting the probability of interface errors and a regression loss term for predicting recommended request intervals.

4. The method according to claim 1, characterized in that the method further includes an online model update step: continuously collecting the latest API request log data; and updating and training the interface error prediction model at a preset period using the latest collected log data.

5. The method according to claim 1, characterized in that, before training on the feature sequence using the time series prediction model, the method further includes: optimizing the initial weights of the time series prediction model using a metaheuristic optimization algorithm.

6. The method according to claim 1, characterized in that the time series prediction model is a multi-model integrated system including at least two different types of time series prediction models; and predicting the probability of the API interface experiencing the preset type of interface error within a future time window includes: each time series prediction model outputting a prediction probability; and, based on the recent prediction performance metrics of each model, weighting and fusing the prediction probabilities of the models to obtain the final predicted value of the interface error occurrence probability.

7. The method according to claim 1, characterized in that the method further includes a real-time circuit breaker step: monitoring the real-time failure rate of sending data requests to the API interface within the current time window; and, if the real-time failure rate exceeds a preset threshold, ignoring the output of the interface error prediction model and directly enabling a preset protective scheduling strategy, wherein the request frequency of the protective scheduling strategy is lower than the request frequency under the normal adjustment strategy.

8. The method according to claim 1, characterized in that, before training on the feature sequence using the time series prediction model, the method further includes: clustering multiple API interfaces based on the historical call characteristics and error patterns of each API interface to obtain at least two interface categories, wherein the interface error prediction model includes sub-models corresponding to each interface category; and the step of obtaining real-time request log data within the current time window and inputting it into the interface error prediction model includes: identifying the interface category to which the target API interface to be requested belongs, and inputting the real-time request log data into the corresponding sub-model.

9. A neural network-based adaptive scheduling device for API interface requests, used to implement the method described in any one of claims 1 to 8, characterized in that it comprises: a data acquisition module, used to collect API interface request log data within a historical time period, the request log data including at least a request timestamp and interface error information of a preset type that represents the reason for request failure; a sequence construction module, used to construct a time-ordered feature sequence based on the request log data, wherein the feature sequence contains features characterizing the interface load status; a model training module, used to train on the feature sequence using a time series prediction model to obtain an interface error prediction model, the interface error prediction model being used to predict, based on the feature sequence of a historical time window, the probability of the API interface experiencing the preset type of interface error in a future time window; a real-time prediction module, used to obtain real-time request log data within the current time window, input it into the interface error prediction model, and obtain the predicted probability value of interface error occurrence for the next time window; and a strategy scheduling module, used to dynamically adjust the number and / or timing of data requests sent to the API interface based on the predicted probability of the interface error.

10. An adaptive scheduling system for API interface requests, characterized in that it comprises: a requesting device or a separate scheduling server, on which the API interface request adaptive scheduling device as described in claim 9 is deployed, or which is configured to execute the method described in any one of claims 1 to 8; at least one client application, deployed on the requesting device and used to initiate API interface call requests; and at least one API interface server, used to respond to the call requests; wherein the API interface request adaptive scheduling device is set on the request path between the client application and the API interface server, and is used to intercept and intelligently schedule requests from the client application.