Resource scheduling methods and electronic devices

By identifying multi-dimensional matching data sources from a knowledge base within a cloud platform, and utilizing a predictive model to process the statistical and business-related characteristics of the target dataset, predictive resource scheduling information is generated. This solves the problem of cloud platform resource scheduling being unable to adapt to non-linear fluctuations in business load, achieving accurate resource prediction and scheduling, and improving business processing efficiency.

CN122309090A · Pending · Publication Date: 2026-06-30 · DAWNING CLOUD COMPUTING TECH CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
DAWNING CLOUD COMPUTING TECH CO LTD
Filing Date
2026-05-29
Publication Date
2026-06-30

Abstract

This invention provides a resource scheduling method and electronic device applicable to the field of cloud computing technology. The resource scheduling method includes: responding to a received retrieval request regarding cloud platform resource scheduling; determining a data source matching the request type from a knowledge base; the cloud platform including multiple nodes; the knowledge base including various data sources matching the request type; the data sources storing resource data, business operation data, and business attribute data of various services for the corresponding multiple nodes; determining at least one target data matching the retrieval request in multiple dimensions from the data sources, as a target dataset; the multiple dimensions including at least numerical and semantic dimensions; processing the statistical features of the target dataset and the business association features extracted from the business attribute data of various services using a predictive model to obtain predicted resource scheduling information; and the cloud platform using the predicted resource scheduling information to perform resource scheduling on multiple nodes.

Description

Technical Field

[0001] This invention relates to the field of cloud computing technology, and more specifically to a resource scheduling method and electronic device.

Background Technology

[0002] With the rapid development of cloud computing technology, the types of businesses supported by cloud platforms are becoming increasingly complex, such as e-commerce transactions, financial data processing, and industrial IoT monitoring. The resource requirements of different businesses fluctuate significantly, such as a surge in CPU usage during e-commerce promotional periods and peak changes in memory usage for offline computing at night. However, cloud platform resource scheduling often struggles to adapt to the non-linear fluctuations of business loads, resulting in insufficient prediction accuracy and low business processing efficiency.

Summary of the Invention

[0003] In view of the above problems, the present invention provides a method and electronic device for improving resource scheduling.

[0004] According to a first aspect of the present invention, a resource scheduling method is provided, comprising: responding to receiving a retrieval request regarding cloud platform resource scheduling; determining, from a knowledge base, a data source matching the request type of the retrieval request, wherein the cloud platform includes multiple nodes, the knowledge base includes multiple data sources matching the request type, the data sources being used to store resource data, business operation data, and business attribute data of multiple services executable by the cloud platform for the corresponding multiple nodes, the resource data representing the available resources of the nodes, and the business operation data representing the operation status of services on the nodes; determining, from the data sources, at least one target data matching the retrieval request in multiple dimensions as a target dataset, wherein the multiple dimensions include at least numerical dimensions and semantic dimensions; processing the statistical features of the target dataset and the business association features extracted from the business attribute data of the multiple services using a prediction model to obtain predicted resource scheduling information, and the cloud platform using the predicted resource scheduling information to perform resource scheduling on the multiple nodes.

[0005] According to an embodiment of the present invention, in response to receiving a retrieval request regarding cloud platform resource scheduling, a data source matching the request type of the retrieval request is determined from the knowledge base, accurately locking the data source matching the request type. The data source stores resource data, business operation data, and business attribute data of various services executable by the cloud platform for multiple corresponding nodes. At least one target data matching the retrieval request is determined from the data source, matching it at least in both numerical and semantic dimensions. This target dataset can accurately match hard constraints such as resource thresholds, load ranges, and hardware specifications using the numerical dimension, and achieve semantic similarity matching of scheduling cases, fluctuation patterns, and business requirements using the semantic dimension, significantly improving the accuracy and comprehensiveness of the retrieval results. Subsequently, a predictive model is used to process the statistical characteristics of the target dataset and the business correlation characteristics of various services to obtain predicted resource scheduling information. The predictive model can capture complex nonlinear change patterns of business load based on real, highly correlated business operation data and resource data, rather than relying on simplified assumptions or global average data, thereby enhancing the response capability to sudden, periodic, and other nonlinear fluctuations. Meanwhile, the independent forecasting of a single business is extended to joint modeling of multiple businesses, which effectively alleviates the load distortion problem caused by mutual interference or collaboration between businesses, reduces forecasting errors, avoids resource overload or idleness caused by inaccurate forecasting, and thus improves the overall business processing efficiency.

[0006] According to an embodiment of the present invention, a mapping relationship between the request type of the retrieval request and the data source is pre-configured. The request type of the retrieval request includes at least one of the following: resource prediction request, scheduling scheme generation request, and anomaly warning request. According to the mapping relationship, the data source matched with the resource prediction request includes: historical resource data and historical prediction errors of similar historical services, wherein the similar historical services are associated with the target service in the retrieval request. According to the mapping relationship, the data source matched with the scheduling scheme generation request includes: scheduling case data of similar historical services and real-time resource data of multiple nodes. According to the mapping relationship, the data source matched with the anomaly warning request includes: historical anomaly handling schemes and business risk assessment data of similar historical anomalies, wherein the similar historical anomalies are associated with the target anomaly in the retrieval request.
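The pre-configured mapping described in [0006] can be pictured as a lookup table from request type to data sources. The following is a minimal sketch only: the enum members, the source names, and the helper `resolve_data_sources` are hypothetical labels chosen for illustration and do not appear in the filing.

```python
from enum import Enum


class RequestType(Enum):
    RESOURCE_PREDICTION = "resource_prediction"
    SCHEDULING_SCHEME_GENERATION = "scheduling_scheme_generation"
    ANOMALY_WARNING = "anomaly_warning"


# Hypothetical source names; the filing only describes what each source holds.
DATA_SOURCE_MAP = {
    RequestType.RESOURCE_PREDICTION: [
        "historical_resource_data",
        "historical_prediction_errors",
    ],
    RequestType.SCHEDULING_SCHEME_GENERATION: [
        "scheduling_case_data",
        "realtime_resource_data",
    ],
    RequestType.ANOMALY_WARNING: [
        "historical_anomaly_handling_schemes",
        "business_risk_assessment_data",
    ],
}


def resolve_data_sources(request_types):
    """Return the de-duplicated data sources for one or more request types."""
    sources = []
    for request_type in request_types:
        for source in DATA_SOURCE_MAP[request_type]:
            if source not in sources:
                sources.append(source)
    return sources
```

Because a retrieval request may carry more than one request type, the helper accepts a list and merges the matched sources in order.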

[0007] According to an embodiment of the present invention, the statistical features include the statistical features of historical resource data and historical business operation data of historical similar businesses in the target dataset; the statistical features of the target dataset and the business association features extracted from the business attribute data of various businesses are processed using a prediction model to obtain predicted resource scheduling information, including: when the request type includes a resource prediction request, the statistical features, business association features and real-time resource data are processed using a prediction model to obtain resource prediction information, which serves as predicted resource scheduling information; when the request type includes an anomaly warning request, the resource prediction value in the resource prediction information is compared with a preset resource threshold to determine that the cloud platform is in an abnormal state; and an anomaly warning is generated based on the historical anomaly handling schemes and business risk assessment data in the target dataset, which serves as predicted resource scheduling information.

[0008] According to embodiments of the present invention, statistical features reflect historical operational patterns of resources, while business-related features reflect differences in business needs. The combination of these two features enables the predictive model to accurately capture resource change trends, avoiding the limitations of a single data dimension. This allows for accurate prediction of resource needs, providing a reliable basis for scheduling decisions, enabling the rational allocation of resources in advance, avoiding resource idleness or shortages, and improving resource utilization efficiency. Comparing the predicted resource values in the resource prediction information with preset resource thresholds can quickly locate anomalies such as resource overload and shortage. Historical anomaly handling solutions provide mature references for handling, reducing trial-and-error costs and ensuring the rationality of anomaly handling. Early detection of potential anomalies allows business risk assessment data to assist in determining whether business interruptions have occurred, reducing the decision-making difficulty and operational costs of anomaly handling, and ensuring stable business operation.
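The threshold comparison in [0007] and [0008] can be sketched as follows. This is an illustrative sketch under stated assumptions: the dictionary layout keyed by metric name, the `AnomalyWarning` record, and the function `check_anomalies` are hypothetical, and only the overload case (prediction above an upper threshold) is shown; a shortage check would be analogous.

```python
from dataclasses import dataclass


@dataclass
class AnomalyWarning:
    metric: str
    predicted: float
    threshold: float
    suggested_handling: str
    risk_level: str


def check_anomalies(predicted, thresholds, handling_schemes, risk_assessment):
    """Compare predicted resource values with preset thresholds.

    All four arguments are dicts keyed by metric name (assumed layout).
    A warning is produced for every metric whose prediction exceeds its
    threshold, enriched with the retrieved handling scheme and risk data.
    """
    warnings = []
    for metric, value in predicted.items():
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            warnings.append(
                AnomalyWarning(
                    metric=metric,
                    predicted=value,
                    threshold=limit,
                    suggested_handling=handling_schemes.get(
                        metric, "no historical scheme found"
                    ),
                    risk_level=risk_assessment.get(metric, "unknown"),
                )
            )
    return warnings
```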

[0009] According to an embodiment of the present invention, the method further includes: using target historical resource data in the target dataset with a first similarity greater than a preset similarity threshold with the resource prediction information as the knowledge basis of the resource prediction information; and determining the credibility of the knowledge basis based on a second similarity between the knowledge basis and the resource prediction information and the historical prediction error of the target historical resource data.
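The filing does not state how the first similarity, the second similarity, or the credibility are computed. The sketch below is one possible reading, with every formula an assumption: cosine similarity stands in for the first similarity, `1 / (1 + mean absolute difference)` for the second, and credibility is taken as the second similarity discounted by the historical relative prediction error. All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed first similarity: cosine of the angle between two series."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def magnitude_similarity(a, b):
    """Assumed second similarity: 1 / (1 + mean absolute difference)."""
    mad = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mad)


def knowledge_basis(prediction, history, threshold=0.9):
    """Select historical series that support a prediction and score them.

    history: list of (series, historical_relative_error) tuples.
    Returns the supporting series sorted by descending credibility.
    """
    basis = []
    for series, error in history:
        if cosine_similarity(series, prediction) > threshold:
            clipped_error = min(max(error, 0.0), 1.0)
            credibility = magnitude_similarity(series, prediction) * (1.0 - clipped_error)
            basis.append({"series": series, "credibility": credibility})
    return sorted(basis, key=lambda item: item["credibility"], reverse=True)
```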

[0010] According to embodiments of the present invention, based on the target dataset obtained by retrieval enhancement, knowledge basis and predictive resource scheduling information are output synchronously, which can improve the reliability of prediction and the executability of early warning, thereby realizing proactive prediction and accurate execution of resource scheduling. While improving resource utilization, it ensures stable business operation, reduces manual intervention, and better adapts to the large-scale, high-concurrency operation requirements of cloud platforms.

[0011] According to an embodiment of the present invention, the scheduling case data of historical similar services in the target dataset includes multiple scheduling methods and their respective execution effects. The statistical characteristics of the target dataset and the business association features extracted from the business attribute data of multiple services are processed by a prediction model to obtain predicted resource scheduling information. This includes: when the request type includes a scheduling scheme generation request, determining a target scheduling method from multiple scheduling methods based on real-time resource data from multiple nodes, resource prediction information, and the respective execution effects of the multiple scheduling methods; and generating a resource scheduling scheme based on the target scheduling method as the predicted resource scheduling information.

[0012] According to embodiments of the present invention, when selecting target scheduling methods, three types of core data are combined simultaneously—real-time resource data of multiple nodes, resource prediction information, and the execution effects of various scheduling methods—to avoid scheduling method selection bias caused by a single data dimension. A resource scheduling scheme is generated based on the selected target scheduling methods, giving the resource scheduling scheme a clear execution basis and improving the accuracy of the generated resource scheduling scheme.
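A minimal sketch of the selection step in [0011] and [0012], combining the three inputs named there. The record fields (`method`, `capacity_gain`, `effect_score`) and the selection rule are assumptions made for illustration; the filing does not specify how the three inputs are weighed.

```python
def select_scheduling_method(cases, free_capacity, predicted_demand):
    """Pick the historical scheduling method best suited to the predicted gap.

    cases: list of dicts with hypothetical keys
        "method" (str), "capacity_gain" (float), "effect_score" (float in [0, 1]).
    free_capacity: currently free capacity summed over nodes (real-time data).
    predicted_demand: demand forecast taken from the resource prediction information.
    """
    gap = max(predicted_demand - free_capacity, 0.0)
    feasible = [c for c in cases if c["capacity_gain"] >= gap]
    pool = feasible or cases  # fall back to all cases if none closes the gap
    # Prefer the best recorded execution effect; break ties with the smaller change.
    return max(pool, key=lambda c: (c["effect_score"], -c["capacity_gain"]))
```

For example, with a predicted gap of 10 units, a migration case that frees only 4 units is excluded even if its recorded effect is better than that of a scale-out case that frees 16.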

[0013] At the same time, the resource scheduling plan is incorporated into the predicted resource scheduling information, realizing a complete closed loop of resource prediction, anomaly warning and scheduling plan, and upgrading the predicted resource scheduling information from predictive guidance to a feasible execution plan.

[0014] According to an embodiment of the present invention, the node includes physical machines. Generating a resource scheduling scheme based on a target scheduling method includes: verifying the target scheduling method based on the available resource capacity of multiple physical machines to obtain a verification result; if the verification result indicates a conflict between the target scheduling method and the available resource capacity of multiple physical machines, generating a resource scheduling scheme for expanding the capacity of multiple physical machines based on the target scheduling method; and if multiple physical machines do not meet the expansion conditions, generating a resource scheduling scheme that matches the available resource capacity of multiple physical machines using a prediction model.

[0015] According to embodiments of the present invention, when the target scheduling method is difficult to adapt to the current physical machine resources, resource conflicts can be resolved by expanding the capacity of physical machines to supplement resource supply, ensuring that the target scheduling method can be successfully implemented, while also taking into account the scalability of physical machine resources. When the physical machine cannot be expanded (e.g., due to hardware limitations), the original target scheduling method is not forcibly executed. Instead, the scheduling scheme is adjusted by combining a predictive model with the actual available resources of the physical machines to achieve resource-adaptive scheduling.
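The three-way decision in [0014] and [0015] (execute as is, expand capacity, or fall back to the prediction model) can be sketched as below. Capacity is reduced to a single scalar for brevity, and `fallback_model` is a hypothetical callable standing in for the prediction model; neither simplification comes from the filing.

```python
def build_scheduling_scheme(required, available, expandable, fallback_model):
    """Verify a target scheduling method against physical-machine capacity.

    required:   capacity the target scheduling method needs (assumed scalar).
    available:  dict of machine name -> currently available capacity.
    expandable: dict of machine name -> extra capacity that could be added.
    fallback_model: callable(total_available) -> scheme, used when the
        machines cannot be expanded far enough.
    """
    total_available = sum(available.values())
    if required <= total_available:
        # Verification passes: no conflict with available capacity.
        return {"action": "execute", "capacity": required}

    shortfall = required - total_available
    if sum(expandable.values()) >= shortfall:
        # Conflict, but the expansion condition is met.
        return {"action": "expand_then_execute", "expand_by": shortfall, "capacity": required}

    # Expansion is not possible: generate a scheme that fits what is available.
    return fallback_model(total_available)
```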

[0016] According to an embodiment of the present invention, the knowledge base includes a historical data layer, a real-time data layer, and a knowledge index layer. The historical data layer is used to store historical resource data, historical operation and maintenance data, historical anomaly handling data, and business attribute data of various services of the cloud platform. The real-time data layer is used to store real-time resource data via distributed system storage nodes, operation data of the operation and maintenance interface, and real-time business data collected from business systems. The knowledge index layer is used to construct historical data indexes based on the data in the historical data layer, construct real-time data indexes based on the data in the real-time data layer, and construct semantic indexes related to cloud platform resource scheduling. The historical data indexes, real-time data indexes, and semantic indexes are used for retrieval operations.

[0017] According to an embodiment of the present invention, by dividing the database into a real-time data layer, a historical data layer, and a knowledge index layer, the real-time data layer enables the cloud platform to quickly query and calculate the current node status and business indicators, ensuring low latency and high response speed for resource scheduling and anomaly detection. The historical data layer provides sufficient historical features and patterns for the resource prediction model, and enables data traceability, improving the reliability of prediction and decision-making. The knowledge index layer constructs a structured index for historical data, business attributes, scheduling cases, etc., enabling multi-dimensional rapid positioning and retrieval during the retrieval process, significantly improving knowledge matching efficiency and retrieval accuracy, and reducing data preparation overhead before prediction model inference.

[0018] According to an embodiment of the present invention, the method further includes: in response to receiving a real-time resource data sequence from a node, filling in missing values in the real-time resource data sequence to obtain a filled real-time resource data sequence, wherein the real-time resource data sequence includes multiple real-time resource data arranged in chronological order; in the case of outliers in the real-time resource data sequence, obtaining replacement values based on the resource peak values and the weights of multiple reference points in the real-time resource data sequence, and updating the outliers to the replacement values to obtain a preprocessed real-time resource data sequence, wherein the weights are determined based on the time interval between the reference points and the resource peak values; and synchronously writing the preprocessed real-time resource data sequence to the real-time data layer in real time, and asynchronously writing it to the historical data layer.

[0019] According to embodiments of the present invention, missing value filling ensures the accuracy of real-time data, and outlier handling removes abnormal data from real-time resource data caused by equipment failure, acquisition errors, or sudden interference, thus preventing abnormal data from interfering with subsequent analysis. The hierarchical storage method of "real-time synchronization + asynchronous writing" balances the timeliness of real-time data with the storage requirements of historical data, ensuring both the efficiency of real-time scheduling and anomaly early warning, and the reliability of resource prediction and historical analysis.
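The cleaning steps in [0018] leave several details open, so the sketch below fixes them by assumption: missing values are forward-filled, an outlier is any value above a caller-supplied `upper_bound`, the resource peak is the largest non-outlier value, the reference points are the valid samples within `window` steps of the outlier, and each reference point is weighted by `1 / (1 + time distance to the peak)`. The function names and all of these rules are illustrative, not taken from the filing; the storage step is omitted.

```python
def fill_missing(series):
    """Forward-fill None entries; leading gaps take the first observed value."""
    observed = [v for v in series if v is not None]
    if not observed:
        return []
    filled, last = [], observed[0]
    for v in series:
        last = v if v is not None else last
        filled.append(last)
    return filled


def replace_outliers(series, upper_bound, window=2):
    """Replace values above upper_bound with a peak-weighted mean of neighbours."""
    valid = [i for i, v in enumerate(series) if v <= upper_bound]
    if not valid:
        return list(series)
    # Index of the resource peak among non-outlier samples.
    peak_idx = max(valid, key=lambda i: series[i])
    cleaned = list(series)
    for i, v in enumerate(series):
        if v <= upper_bound:
            continue
        # Reference points: valid samples near the outlier (all valid ones if none are near).
        refs = [j for j in valid if abs(j - i) <= window] or valid
        # Weight shrinks as the reference point gets further in time from the peak.
        weights = [1.0 / (1 + abs(j - peak_idx)) for j in refs]
        cleaned[i] = sum(w * series[j] for w, j in zip(weights, refs)) / sum(weights)
    return cleaned
```

Under these assumptions a replacement value always lies between the smallest and largest reference values, so a single spike cannot push the cleaned series above the observed peak.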

[0020] According to an embodiment of the present invention, determining at least one target data that matches a retrieval request in multiple dimensions from a data source, as a target dataset, includes: determining multiple retrieval data related to at least one of the numerical vector and semantic vector in the retrieval request from the data source through a knowledge indexing layer; and determining at least one target data from the multiple retrieval data based on the relevance between the multiple retrieval data and the retrieval request, respectively.
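One way to picture the two-dimension retrieval in [0020] is a weighted fusion of a numerical score and a semantic score, followed by ranking. The sketch below assumes each record carries a numerical vector and a semantic embedding; the scoring functions, the fusion weight `alpha`, and the record layout are hypothetical choices, since the filing does not define how relevance is computed.

```python
import math


def cosine(a, b):
    """Cosine similarity, used here as the assumed semantic score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def numeric_similarity(a, b):
    """Assumed numerical score: 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(a, b))


def retrieve(query, records, top_k=3, alpha=0.5):
    """Rank records by a weighted mix of numerical and semantic relevance.

    query and each record are dicts with "numeric" and "semantic" vectors;
    records additionally carry an "id". Returns the ids of the top_k records.
    """
    scored = []
    for record in records:
        score = alpha * numeric_similarity(query["numeric"], record["numeric"]) + (
            1 - alpha
        ) * cosine(query["semantic"], record["semantic"])
        scored.append((score, record["id"]))
    scored.sort(reverse=True)
    return [record_id for _, record_id in scored[:top_k]]
```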

[0021] A second aspect of the present invention provides a resource scheduling apparatus, comprising: a first determining module, configured to, in response to receiving a retrieval request regarding cloud platform resource scheduling, determine a data source from a knowledge base that matches the request type of the retrieval request, wherein the cloud platform includes multiple nodes, the knowledge base includes multiple data sources that match the request type, and the data sources are used to store resource data, business operation data, and business attribute data of multiple services executable by the cloud platform for the corresponding multiple nodes, wherein the resource data represents the available resources of the nodes, and the business operation data represents the operation status of the services on the nodes; a second determining module, configured to determine at least one target data that matches the retrieval request in multiple dimensions from the data source, as a target dataset, wherein the multiple dimensions include at least numerical dimensions and semantic dimensions; and a processing module, configured to process the statistical features of the target dataset and the business association features extracted from the business attribute data of the multiple services using a prediction model to obtain predicted resource scheduling information, wherein the cloud platform uses the predicted resource scheduling information to perform resource scheduling on the multiple nodes.

[0022] A third aspect of the present invention provides an electronic device comprising: one or more processors; and a memory for storing one or more computer programs, wherein the one or more processors execute the one or more computer programs to implement the steps of the method described above.

[0023] A fourth aspect of the present invention also provides a computer-readable storage medium having a computer program or instructions stored thereon, wherein the computer program or instructions, when executed by a processor, implement the steps of the above-described method.

[0024] A fifth aspect of the present invention also provides a computer program product, including a computer program or instructions that, when executed by a processor, implement the steps of the above-described method.

[0025] According to an embodiment of the present invention, in response to receiving a retrieval request regarding cloud platform resource scheduling, a data source matching the request type of the retrieval request is determined from the knowledge base, accurately locking the data source matching the request type. The data source stores resource data, business operation data, and business attribute data of various services executable by the cloud platform for multiple corresponding nodes. At least one target data matching the retrieval request is determined from the data source, matching it at least in both numerical and semantic dimensions. This target dataset can accurately match hard constraints such as resource thresholds, load ranges, and hardware specifications using the numerical dimension, and achieve semantic similarity matching of scheduling cases, fluctuation patterns, and business requirements using the semantic dimension, significantly improving the accuracy and comprehensiveness of the retrieval results. Subsequently, a predictive model is used to process the statistical characteristics of the target dataset and the business correlation characteristics of various services to obtain predicted resource scheduling information. The predictive model can capture complex nonlinear change patterns of business load based on real, highly correlated business operation data and resource data, rather than relying on simplified assumptions or global average data, thereby enhancing the response capability to sudden, periodic, and other nonlinear fluctuations. Meanwhile, the independent forecasting of a single business is extended to joint modeling of multiple businesses, which effectively alleviates the load distortion problem caused by mutual interference or collaboration between businesses, reduces forecasting errors, avoids resource overload or idleness caused by inaccurate forecasting, and thus improves the overall business processing efficiency. 

Attached Figure Description

[0026] The above-described features, other objects, and advantages of the present invention will become clearer from the following description of embodiments of the invention with reference to the accompanying drawings, in which:

[0027] Figure 1 shows an application scenario diagram of the resource scheduling method according to an embodiment of the present invention;

[0028] Figure 2 shows a flowchart of a resource scheduling method according to an embodiment of the present invention;

[0029] Figure 3 shows a schematic diagram of multi-dimensional fusion retrieval according to an embodiment of the present invention;

[0030] Figure 4 shows a flowchart of a resource scheduling method according to another embodiment of the present invention;

[0031] Figure 5 shows a structural block diagram of a resource scheduling apparatus according to an embodiment of the present invention;

[0032] Figure 6 shows a block diagram of an electronic device suitable for implementing a resource scheduling method according to an embodiment of the present invention.

Detailed Implementation

[0033] Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. However, it should be understood that these descriptions are exemplary only and are not intended to limit the scope of the invention. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments of the invention for ease of explanation. However, it will be apparent that one or more embodiments may be practiced without these specific details. Furthermore, descriptions of well-known structures and techniques are omitted in the following description to avoid unnecessarily obscuring the concept of the invention.

[0034] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The terms “comprising,” “including,” etc., as used herein indicate the presence of the stated features, steps, operations, and/or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.

[0035] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.

[0036] When using expressions such as "at least one of A, B and C", they should generally be interpreted in accordance with the meaning that is commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" should include, but is not limited to, a system having A alone, a system having B alone, a system having C alone, a system having A and B, a system having A and C, a system having B and C, and/or a system having A, B and C, etc.).

[0037] In the technical solution of this invention, the user information (including but not limited to user personal information, user image information, user device information, such as location information) and data (including but not limited to data used for analysis, stored data, displayed data, etc.) involved are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of related data all comply with relevant laws, regulations, and standards, take necessary confidentiality measures, do not violate public order and good morals, and provide corresponding operation entry points for users to choose to authorize or refuse.

[0038] With the rapid development of cloud computing technology, the types of businesses supported by cloud platforms are becoming increasingly complex (such as e-commerce transactions, financial data processing, and industrial IoT monitoring). The resource requirements of different businesses exhibit significant fluctuations (such as a surge in CPU usage during e-commerce promotional periods and changes in peak memory usage for offline computing at night). However, cloud platform resource scheduling often struggles to adapt to the non-linear fluctuations of business loads, resulting in insufficient prediction accuracy and low business processing efficiency.

[0039] In the scheduling process of related technologies, operations and maintenance personnel need to manually summarize reports of various business resources and determine the necessity of scheduling. This is not only inefficient (analysis of a single business takes more than 30 minutes), but also prone to errors due to human experience bias (such as misjudging low-load business as high-load, resulting in wasted expansion resources). Therefore, there is an urgent need for a method that can deeply analyze long-term business resource data, accurately predict demand, and work with operations and maintenance personnel to achieve efficient scheduling.

[0040] In view of this, embodiments of the present invention provide a resource scheduling method, comprising: responding to receiving a retrieval request regarding cloud platform resource scheduling, determining a data source matching the request type of the retrieval request from a knowledge base, wherein the cloud platform includes multiple nodes, the knowledge base includes multiple data sources matching the request type, the data sources are used to store resource data, business operation data, and business attribute data of multiple services executable by the cloud platform for the corresponding multiple nodes, the resource data representing the available resources of the nodes, and the business operation data representing the operation status of the services on the nodes; determining at least one target data matching the retrieval request in multiple dimensions from the data sources as a target dataset, wherein the multiple dimensions include at least numerical dimensions and semantic dimensions; processing the statistical features of the target dataset and the business association features extracted from the business attribute data of multiple services using a prediction model to obtain predicted resource scheduling information, and the cloud platform using the predicted resource scheduling information to perform resource scheduling on multiple nodes.

[0041] Figure 1 shows an application scenario diagram of the resource scheduling method according to an embodiment of the present invention.

[0042] As shown in Figure 1, the application scenario of the resource scheduling method includes cloud platform 110, distributed system 120, and database 130.

[0043] The cloud platform 110 may include multiple nodes (nodes 1, 2, 3, and 4 in the diagram) and a management and control platform. Each node acts as a data producer, sending the resource data and business operation data it generates to the distributed system 120. It should be noted that nodes can be divided into physical machine nodes and virtual machine nodes; physical machine nodes are physical servers that provide hardware resource support; virtual machine nodes are logically isolated virtual computing units obtained from physical machine nodes through virtualization technology.

[0044] The distributed system 120 can act as a message flow middleware to uniformly cache and distribute data streams (such as real-time resource data and real-time business operation data) from multiple nodes, so that the cloud platform 110's management and control platform can consume the data.

[0045] Data consumed and cleaned by the cloud platform 110 is written to the real-time data layer and historical data layer of database 130 for resource statistics, prediction and scheduling decisions.

[0046] Figure 2 shows a flowchart of a resource scheduling method according to an embodiment of the present invention.

[0047] As shown in Figure 2, the resource scheduling method in this embodiment includes operations S210 to S230.

[0048] In operation S210, in response to receiving a retrieval request regarding cloud platform resource scheduling, a data source matching the request type of the retrieval request is determined from the knowledge base. The cloud platform includes multiple nodes, and the knowledge base includes various data sources matching the request type. The data sources are used to store resource data, business operation data, and business attribute data of various services that can be executed by the cloud platform for the corresponding multiple nodes. Resource data represents the available resources of the node, and business operation data represents the operation status of the services on the node.

[0049] According to an embodiment of the present invention, nodes can be divided into physical machine nodes and virtual machine nodes; physical machine nodes are physical servers that provide hardware resource support; virtual machine nodes are virtual computing units that are logically isolated from each other and are divided based on physical machine nodes through virtualization technology.

[0050] Cloud platform resource scheduling can be a management and control behavior that dynamically allocates, scales up or down, or migrates computing resources on each node in order to balance node load, improve resource utilization, ensure stable business operation, and reduce operation and maintenance costs.

[0051] According to an embodiment of the present invention, a retrieval request is a query instruction generated based on resource statistics, prediction, or scheduling requirements, which can be used to retrieve resource data, business data, and scheduling knowledge related to the current query instruction from the knowledge base.

[0052] The request type for a retrieval request can be categorized based on semantic differences. For example, request types can include scheduling scheme generation type, scheduling alert type, scheduling evaluation type, etc.

[0053] According to embodiments of the present invention, the knowledge base can be a distributed database, capable of offline batch updates and real-time updates of resource data and business operation data of the cloud platform.

[0054] According to embodiments of the present invention, the data source may further include anomaly handling data, scheduling cases, and cloud platform operation and maintenance data.

[0055] According to embodiments of the present invention, resource data may include time-series data of indicators such as node CPU (Central Processing Unit) utilization, memory usage, disk I / O (Input / Output) throughput, and network bandwidth utilization. Service operation data may include time-series data of indicators such as the number of concurrent service requests and transaction processing latency.

[0056] According to embodiments of the present invention, business attribute data represents characteristic data of business type, operational characteristics, and resource constraints. For example, business attribute data may include: business type, importance level, application to which the business belongs, business resource usage habits (such as CPU-intensive, memory-intensive, I / O interface-intensive), business expansion threshold, business-related nodes, relationships between different businesses (such as dependency relationships, mutual exclusion relationships, temporal coupling relationships), migration constraints, etc.

[0057] In operation S220, at least one target data matching the retrieval request in multiple dimensions is determined from the data source as the target dataset, wherein the multiple dimensions include at least numerical dimensions and semantic dimensions.

[0058] According to embodiments of the present invention, numerical dimensions can be used to characterize quantifiable numerical features such as resource indicators and operating parameters, thereby achieving matching between target data and screening intervals.

[0059] According to embodiments of the present invention, the semantic dimension can be used to characterize unstructured semantic features such as business scenarios, scheduling intentions, and case descriptions, so as to match the semantics of the target data with the retrieval request.

[0060] According to embodiments of the present invention, the multiple dimensions may also include time dimension, business attribute dimension, and operational status dimension, etc. The time dimension can be used for time-series feature matching, the business attribute dimension can be used for business feature differentiation, and the operational status dimension can be used for node status filtering.

[0061] For example, a Retrieval-Augmented Generation (RAG) model can be used to identify, from the data source, at least one item of target data that matches the retrieval request in multiple dimensions, as the target dataset.

[0062] For example, a knowledge base's indexing layer (such as semantic and numerical indexes) can be used to identify at least one target data point from the data source that matches the retrieval request across multiple dimensions. A semantic index can be a semantic vector based on business descriptions and scheduling requirements, such as "migration solution for insufficient memory in logistics tracking business." A numerical index can be a numerical vector based on "unique node identifier - unique business identifier - real-time indicator threshold," such as "physical machine identifier - e-commerce transaction identifier - CPU ≥ 80."
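The two-dimensional matching described above can be illustrated with a minimal sketch. It assumes that each stored record carries a numeric CPU reading and a precomputed semantic vector; the `Record` type, the `match` function, and both thresholds are hypothetical names chosen for illustration and do not appear in the embodiment.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Record:
    node_id: str                  # unique node identifier
    business_id: str              # unique business identifier
    cpu_percent: float            # numerical dimension (real-time indicator)
    embedding: Tuple[float, ...]  # semantic dimension (hypothetical vector)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match(records: Sequence[Record], query_embedding: Sequence[float],
          cpu_min: float, sim_min: float) -> List[Record]:
    """Keep records that satisfy BOTH the numeric filter and the semantic filter."""
    hits = [
        r for r in records
        if r.cpu_percent >= cpu_min
        and cosine(r.embedding, query_embedding) >= sim_min
    ]
    # Most semantically similar records first.
    return sorted(hits, key=lambda r: cosine(r.embedding, query_embedding),
                  reverse=True)
```

A record is returned only if it passes the numeric threshold (for example "CPU ≥ 80") and is also semantically close to the request, which mirrors the combination of a numerical index and a semantic index.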

[0063] In operation S230, the statistical features of the target dataset and the business association features extracted from the business attribute data of various businesses are processed by the prediction model to obtain predictive resource scheduling information. The cloud platform uses the predictive resource scheduling information to schedule resources for multiple nodes.

[0064] According to embodiments of the present invention, the prediction model can be a large model based on the Transformer architecture, possessing the capabilities of natural language understanding, text generation, and data reasoning. A large model based on the Transformer architecture can simultaneously analyze long-sequence data and business-related features.

[0065] According to embodiments of the present invention, statistical features may include resource-related statistical features and business operation-related statistical features concentrated in the target dataset. Resource-related statistical features may include the mean, variance, and fluctuation (trend) of resource indicators over a period of time. Business operation-related statistical features may include the mean, variance, and fluctuation of business operation indicators over a period of time.
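As a rough illustration of these statistical features, the sketch below computes the mean, the population variance, and a trend for one indicator series. Treating the "fluctuation (trend)" as the least-squares slope against the sample index is an assumption made here; the embodiment does not specify how the trend is calculated, and the function name is hypothetical.

```python
from statistics import fmean, pvariance
from typing import Dict, Sequence


def statistical_features(series: Sequence[float]) -> Dict[str, float]:
    """Mean, population variance, and linear trend of an indicator time series."""
    n = len(series)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean = fmean(series)
    variance = pvariance(series, mu=mean)
    # Trend: least-squares slope of value against sample index 0..n-1.
    t_mean = (n - 1) / 2
    numerator = sum((t - t_mean) * (x - mean) for t, x in enumerate(series))
    denominator = sum((t - t_mean) ** 2 for t in range(n))
    return {"mean": mean, "variance": variance, "trend": numerator / denominator}
```

The same function can be applied to resource indicators (such as CPU utilization) and to business operation indicators (such as concurrent request counts).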

[0066] According to embodiments of the present invention, predictive resource scheduling information is forward-looking scheduling information obtained through predictive model reasoning based on resource-related statistical characteristics, business operation-related statistical characteristics, and business-related characteristics, used to guide cloud platform resource allocation. For example, predictive resource scheduling information may include "predicted resource load values and resource usage trends for nodes over a future period," "predicted business resource demand results and early warning information for insufficient or overloaded resources," "virtual machine migration timing, migration targets, and recommended target physical machines," and "resource conflict prediction results and corresponding scheduling avoidance schemes," etc.

[0067] According to an embodiment of the present invention, the resource scheduling method may be executed by the management platform of a cloud platform.

[0068] According to an embodiment of the present invention, in response to receiving a retrieval request regarding cloud platform resource scheduling, a data source matching the request type of the retrieval request is determined from the knowledge base, accurately locking the data source matching the request type. The data source stores resource data, business operation data, and business attribute data of various services executable by the cloud platform for multiple corresponding nodes. At least one target data matching the retrieval request is determined from the data source, matching it at least in both numerical and semantic dimensions. This target dataset can accurately match hard constraints such as resource thresholds, load ranges, and hardware specifications using the numerical dimension, and achieve semantic similarity matching of scheduling cases, fluctuation patterns, and business requirements using the semantic dimension, significantly improving the accuracy and comprehensiveness of the retrieval results. Subsequently, a predictive model is used to process the statistical characteristics of the target dataset and the business correlation characteristics of various services to obtain predicted resource scheduling information. The predictive model can capture complex nonlinear change patterns of business load based on real, highly correlated business operation data and resource data, rather than relying on simplified assumptions or global average data, thereby enhancing the response capability to sudden, periodic, and other nonlinear fluctuations. Meanwhile, the independent forecasting of a single business is extended to joint modeling of multiple businesses, which effectively alleviates the load distortion problem caused by mutual interference or collaboration between businesses, reduces forecasting errors, avoids resource overload or idleness caused by inaccurate forecasting, and thus improves the overall business processing efficiency.

[0069] According to an embodiment of the present invention, a mapping relationship between the request type of the retrieval request and the data source is pre-configured. The request type of the retrieval request includes at least one of the following: resource prediction request, scheduling scheme generation request, and anomaly warning request. According to the mapping relationship, the data source matched with the resource prediction request includes: historical resource data and historical prediction errors of similar historical services, wherein the similar historical services are associated with the target service in the retrieval request. According to the mapping relationship, the data source matched with the scheduling scheme generation request includes: scheduling case data of similar historical services and real-time resource data of multiple nodes. According to the mapping relationship, the data source matched with the anomaly warning request includes: historical anomaly handling schemes and business risk assessment data of similar historical anomalies, wherein the similar historical anomalies are associated with the target anomaly in the retrieval request.
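The pre-configured mapping described above could be represented as a simple lookup table. The following is a minimal sketch; the enum members and the data source names are hypothetical labels for the data sources listed in the paragraph, not identifiers from the embodiment.

```python
from enum import Enum
from typing import Iterable, List


class RequestType(Enum):
    RESOURCE_PREDICTION = "resource_prediction"
    SCHEDULING_SCHEME_GENERATION = "scheduling_scheme_generation"
    ANOMALY_WARNING = "anomaly_warning"


# Pre-configured mapping: request type -> matched data sources (hypothetical names).
DATA_SOURCE_MAPPING = {
    RequestType.RESOURCE_PREDICTION: (
        "historical_resource_data",
        "historical_prediction_errors",
    ),
    RequestType.SCHEDULING_SCHEME_GENERATION: (
        "scheduling_case_data",
        "realtime_resource_data",
    ),
    RequestType.ANOMALY_WARNING: (
        "historical_anomaly_handling_schemes",
        "business_risk_assessment_data",
    ),
}


def data_sources_for(request_types: Iterable[RequestType]) -> List[str]:
    """Return the data sources for one or more request types, without duplicates."""
    sources: List[str] = []
    for request_type in request_types:
        for source in DATA_SOURCE_MAPPING[request_type]:
            if source not in sources:
                sources.append(source)
    return sources
```

Because a retrieval request may carry more than one request type, the lookup accepts several types and merges their data sources in order.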

[0070] According to embodiments of this disclosure, the mapping relationship may include data sources corresponding to different request types.

[0071] According to embodiments of the present invention, historical prediction error can be the error between the prediction result and the actual result corresponding to historical resource data of similar historical services. Historical prediction error can reflect the accuracy and reliability of the prediction corresponding to historical resource data.

[0072] For example, the target business in a retrieval request could be an e-commerce transaction, and similar historical businesses could be businesses related to e-commerce transactions.

[0073] According to embodiments of the present invention, real-time resource data from multiple nodes can reflect the real-time resource occupancy of each node on the cloud platform. Historical scheduling case data for similar services can serve as multiple reference cases, and based on real-time resource occupancy, a resource scheduling scheme conforming to the cloud platform can be generated.

[0074] For example, the target anomaly in the retrieval request could be "data loss due to node failure", and the historical anomaly handling scheme for similar historical anomalies could be "complete case regarding data loss due to node failure".

[0075] For example, business risk assessment data could be the risk value of business disruption.

[0076] According to embodiments of the present invention, different types of retrieval requests correspond to different retrieval data sources, which can improve retrieval efficiency.

[0077] According to an embodiment of the present invention, the statistical features include the statistical features of historical resource data and historical business operation data of historical similar businesses in the target dataset; processing the statistical features of the target dataset and the business association features extracted from the business attribute data of various businesses using a prediction model to obtain predictive resource scheduling information includes: when the request type includes a resource prediction request, processing the statistical features, business association features and real-time resource data using a prediction model to obtain resource prediction information as predictive resource scheduling information; when the request type includes an anomaly warning request, comparing the resource prediction value in the resource prediction information with a preset resource threshold to determine that the cloud platform is in an abnormal state; and generating an anomaly warning based on the historical anomaly handling schemes and business risk assessment data in the target dataset as predictive resource scheduling information.

[0078] According to embodiments of the present invention, a resource prediction request can typically be a prediction of the resources consumed by the cloud platform in performing business over a future time period. For example, a resource prediction request could be "the average daily resource consumption of each node on the cloud platform over the next 5 days".

[0079] Resource forecast information can include individual service operation forecasts and resource forecasts for multiple nodes. Resource forecasts can be used to determine whether anomalies will occur in the future. Service operation forecasts can help determine whether service interruptions will occur.

[0080] According to embodiments of the present invention, the prediction model understands and infers based on retrieved factual information (such as statistical features, business association features, and real-time resource data). During the inference process, the load distortion problem caused by the collaboration or mutual interference between multiple businesses is considered. The model also enhances the response capability to sudden tasks (such as temporary promotional activities in e-commerce businesses) through statistical features, thus obtaining accurate resource prediction information.

[0081] According to an embodiment of the present invention, an anomaly warning request can be a resource or service warning for a cloud platform in the future. For example, an anomaly warning request could be "whether resource scheduling anomalies or service anomalies will occur in the next month".

[0082] According to an embodiment of the present invention, the preset resource threshold can be the maximum resource capacity of a node. For example, the preset resource threshold can be 90% CPU utilization of a physical machine. The preset resource threshold can be set according to different resource indicators (such as 80% memory utilization) and is not specifically limited.

[0083] According to an embodiment of the present invention, the predicted resource value in the resource prediction information is compared with a preset resource threshold. If the predicted resource value is greater than the preset resource threshold, it is determined that the cloud platform is in an abnormal state. The business risk assessment data corresponding to the abnormal resource indicator is determined from the target dataset. Based on the historical anomaly handling schemes and business risk assessment data in the target dataset, an anomaly warning is generated as predictive resource scheduling information.

[0084] Anomaly alerts can include the cause of the anomaly and a recommended handling solution. For example, the cause of the anomaly could be "Memory usage will reach 92% at 8 PM on the 8th day in the future," and the recommended handling solution could be "Increase virtual memory capacity."
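The threshold comparison and warning generation can be sketched as follows. This is an illustration under stated assumptions: predicted values and thresholds are keyed by indicator name, and the historical handling schemes and business risk values are supplied as plain dictionaries. All function and field names are hypothetical.

```python
from typing import Dict, List, Optional


def detect_anomalies(predicted: Dict[str, float],
                     thresholds: Dict[str, float]) -> Dict[str, float]:
    """Return the indicators whose predicted value exceeds the preset threshold."""
    return {
        metric: value
        for metric, value in predicted.items()
        if metric in thresholds and value > thresholds[metric]
    }


def build_warning(predicted: Dict[str, float],
                  thresholds: Dict[str, float],
                  handling_schemes: Dict[str, str],
                  risk_data: Dict[str, float]) -> Optional[List[dict]]:
    """Build one warning entry per exceeded indicator, or None if all are normal."""
    exceeded = detect_anomalies(predicted, thresholds)
    if not exceeded:
        return None
    return [
        {
            "metric": metric,
            "predicted": value,
            "threshold": thresholds[metric],
            # Looked up from retrieved historical data; None if nothing matches.
            "recommended_handling": handling_schemes.get(metric),
            "business_risk": risk_data.get(metric),
        }
        for metric, value in exceeded.items()
    ]
```

For instance, with a predicted memory usage of 0.92 against a threshold of 0.80, the warning would carry the retrieved handling scheme "Increase virtual memory capacity" together with the associated risk value.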

[0085] According to embodiments of the present invention, statistical features reflect historical operational patterns of resources, while business-related features reflect differences in business needs. The combination of these two features enables the predictive model to accurately capture resource change trends, avoiding the limitations of a single data dimension. This allows for accurate prediction of resource needs, providing a reliable basis for scheduling decisions, enabling the rational allocation of resources in advance, avoiding resource idleness or shortages, and improving resource utilization efficiency. Comparing the predicted resource values in the resource prediction information with preset resource thresholds can quickly locate anomalies such as resource overload and shortage. Historical anomaly handling solutions provide mature references for handling, reducing trial-and-error costs and ensuring the rationality of anomaly handling. Early detection of potential anomalies allows business risk assessment data to assist in determining whether business interruptions have occurred, reducing the decision-making difficulty and operational costs of anomaly handling, and ensuring stable business operation.

[0086] According to an embodiment of the present invention, the method further includes: using target historical resource data in the target dataset with a first similarity greater than a preset similarity threshold with the resource prediction information as the knowledge basis of the resource prediction information; and determining the credibility of the knowledge basis based on a second similarity between the knowledge basis and the resource prediction information and the historical prediction error of the target historical resource data.

[0087] According to an embodiment of the present invention, the preset similarity threshold may be 90%, 92%, etc., without specific limitation. The preset similarity threshold may also be determined based on the maximum similarity between historical resource data and resource prediction information in the target dataset, and the historical resource data with the maximum similarity is used as the target historical resource data.

[0088] According to an embodiment of the present invention, the knowledge basis may be structured and unstructured knowledge such as historical operation data, statistical features, business attributes, historical exception handling solutions, and business risk assessment data relied on by the prediction model during resource prediction and anomaly warning. Its role is to provide real, traceable, and practical fact support for the cloud platform's actual operation situation for prediction and warning, avoiding decision-making biases caused by the model relying solely on experience or single data.

[0089] According to an embodiment of the present invention, the credibility of the knowledge basis is quantitatively calculated by weighted assignment of the second similarity and historical prediction error.

[0090] For example, let a be the weight coefficient of the second similarity and b be the weight coefficient of the historical prediction error, where a + b = 1 (the weights can be dynamically adjusted according to the cloud platform scheduling priority). The credibility C is calculated as C = a × S + (1 - E) × b, where S is the second similarity, E is the historical prediction error, and (1 - E) represents the historical prediction accuracy, which is positively correlated with the credibility.

[0091] For example, different weight coefficients are set according to the business priority in the business attribute data. For core businesses (such as financial transaction businesses), the historical accuracy of the knowledge basis is emphasized, so the weight coefficient b corresponding to the historical prediction error is set to 0.6 and the weight coefficient a of the second similarity is set to 0.4. For non-core businesses (such as ordinary office businesses), the correlation between the knowledge basis and the current prediction is emphasized, so a = 0.7 and b = 0.3. The credibility of the knowledge basis is then calculated using the formula C = a × S + (1 - E) × b of the embodiment with the weights corresponding to the current business priority.
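A short sketch of the weighted credibility calculation follows. The formula and the weight pairs are taken from the text above; the function name, the `WEIGHTS` table, and its keys are hypothetical.

```python
def credibility(similarity: float, error: float, a: float, b: float) -> float:
    """C = a * S + (1 - E) * b, with the constraint a + b = 1."""
    if abs(a + b - 1.0) > 1e-9:
        raise ValueError("weights a and b must sum to 1")
    return a * similarity + (1.0 - error) * b


# (a, b) per business priority: core businesses weight historical accuracy more,
# non-core businesses weight similarity to the current prediction more.
WEIGHTS = {
    "core": (0.4, 0.6),
    "non_core": (0.7, 0.3),
}
```

With S = 0.95 and E = 0.2, a core business gives C = 0.4 × 0.95 + 0.8 × 0.6 = 0.86, while a non-core business gives C = 0.7 × 0.95 + 0.8 × 0.3 = 0.905.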

[0092] For example, the credibility calculation of the knowledge basis can also be dynamically optimized through iterative correction of the historical prediction error to improve the determination accuracy. First, weighted assignment is performed on the second similarity and the historical prediction error (each with a weight coefficient of 0.5, that is, a = 0.5 and b = 0.5), and the initial credibility C0 of the knowledge basis is quantitatively calculated. Then, historical prediction records related to the current knowledge basis within the past month are screened out from the target historical resource data, and the average prediction error E_avg over this period is calculated. If E_avg < E (the current historical prediction error), the accuracy of this knowledge basis has improved recently, and the current historical prediction error is corrected to E' = (E + E_avg) / 2. If E_avg > E, the accuracy of the knowledge basis has decreased, and E' = E is used to retain the current higher error and reduce the credibility weight.

[0093] The corrected E' is substituted into the formula to calculate the credibility after iteration: C1 = a × S + (1 - E') × b. If the difference between C1 and C0 exceeds 0.1, the above weighted assignment and correction process is repeated until the difference between C1 and C0 is ≤ 0.1, and the target credibility C' is obtained.

[0094] Based on the target credibility C' and the accuracy requirements of the prediction scenario, the usability of the knowledge basis is determined: if C' ≥ 0.8, it is directly used as the core supporting basis; if 0.6 ≤ C' < 0.8, it is used as an auxiliary supporting basis and combined with other highly credible knowledge basis; if C' < 0.6, the knowledge basis is removed and the weight of this type of knowledge in the knowledge base is updated to avoid repeated use in the future.
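The three usability tiers can be expressed as a small classification function. The thresholds 0.8 and 0.6 come from the paragraph above; the function name and the returned labels are hypothetical.

```python
def classify_basis(target_credibility: float) -> str:
    """Map a target credibility C' to its usability tier."""
    if target_credibility >= 0.8:
        return "core"       # used directly as the core supporting basis
    if target_credibility >= 0.6:
        return "auxiliary"  # combined with other highly credible knowledge bases
    return "discard"        # removed; weight of this knowledge type is lowered
```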

[0095] For example, the knowledge basis label for predicting resource scheduling information could be "Knowledge basis: Memory peak in the same month over the past 3 years increased by 60%+, real-time memory increased by 8% in 1 hour, credibility 88%".

[0096] According to embodiments of the present invention, based on the target dataset obtained by retrieval enhancement, knowledge basis and predictive resource scheduling information are output synchronously, which can improve the reliability of prediction and the executability of early warning, thereby realizing proactive prediction and accurate execution of resource scheduling. While improving resource utilization, it ensures stable business operation, reduces manual intervention, and better adapts to the large-scale, high-concurrency operation requirements of cloud platforms.

[0097] According to an embodiment of the present invention, the scheduling case data of historical similar services in the target dataset includes multiple scheduling methods and their respective execution effects. The statistical characteristics of the target dataset and the business association features extracted from the business attribute data of multiple services are processed by a prediction model to obtain predicted resource scheduling information. This includes: when the request type includes a scheduling scheme generation request, determining a target scheduling method from multiple scheduling methods based on real-time resource data from multiple nodes, resource prediction information, and the respective execution effects of the multiple scheduling methods; and generating a resource scheduling scheme based on the target scheduling method as the predicted resource scheduling information.

[0098] For example, historical scheduling case data for similar services could be "In 2023, after business F was expanded by 1.5 times, the CPU utilization rate dropped to 60%", where the scheduling method is "expansion by 1.5 times" and the execution effect is "CPU utilization rate dropped to 60%".

[0099] The real-time resource data for a node could be "25% load on physical machine 15".

[0100] Based on resource forecasting information and real-time resource data from multiple nodes, the resource change trends of multiple nodes over a future period can be determined. From various scheduling methods, a target scheduling method that matches the node's resource change trend is selected. For example, if the resource change trend of a node is a 25% increase in the load of physical machine 15, the scheduling method matching this trend would be "expansion or migration".
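The selection of a target scheduling method is not specified in detail, so the following is only one conceivable heuristic, not the method of the embodiment. It assumes that each scheduling case records a capacity factor and an observed post-scheduling utilization, that the projected load is the current utilization scaled by the predicted growth, and that a utilization ceiling is given. All names and the selection rule itself are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SchedulingCase:
    method: str                  # e.g. "expansion by 1.5 times"
    capacity_factor: float       # capacity multiplier applied by the method
    observed_utilization: float  # execution effect, e.g. 0.60 after scheduling


def select_target_method(cases: Sequence[SchedulingCase],
                         current_utilization: float,
                         predicted_growth: float,
                         max_utilization: float) -> Optional[SchedulingCase]:
    """Pick a historical method that keeps the projected load under the ceiling."""
    projected = current_utilization * (1.0 + predicted_growth)
    if projected <= max_utilization:
        return None  # no scheduling action is needed
    feasible = [
        case for case in cases
        if projected / case.capacity_factor <= max_utilization
    ]
    # Among feasible methods, prefer the best historical execution effect.
    return min(feasible, key=lambda case: case.observed_utilization, default=None)
```

The sketch combines the three inputs named in the text: real-time resource data (current utilization), resource prediction information (predicted growth), and the execution effects of historical scheduling methods.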

[0101] By combining real-time resource data from multiple nodes, multi-node collaborative scheduling can be achieved, avoiding load imbalance caused by single-node scheduling; by combining resource prediction information, the scheduling scheme can be forward-looking, resource allocation can be planned in advance, and the limitations of passive scheduling can be broken; by combining the execution effects of various scheduling methods, the optimal or most suitable method can be selected, taking into account scheduling efficiency, resource utilization and business stability.

[0102] Leveraging the reasoning capabilities of predictive models, combined with differences in business needs reflected in business-related characteristics, real-time resource data from multiple nodes, and target scheduling methods, resource scheduling schemes are intelligently generated. These schemes are generated based on selected target scheduling methods, ensuring they have clear execution criteria rather than being generated blindly.

[0103] For example, resource scheduling schemes may include objects, dimensions, magnitudes, time windows (such as "business F expands CPU to 1.5 times from 2-3 am"), migration target nodes, and risk control (such as "back up data before migration, with interruption time ≤ 10 seconds").

[0104] According to embodiments of the present invention, when selecting target scheduling methods, three types of core data are combined simultaneously—real-time resource data of multiple nodes, resource prediction information, and the execution effects of various scheduling methods—to avoid scheduling method selection bias caused by a single data dimension. A resource scheduling scheme is generated based on the selected target scheduling methods, giving the resource scheduling scheme a clear execution basis and improving the accuracy of the generated resource scheduling scheme.

[0105] At the same time, the resource scheduling plan is incorporated into the predicted resource scheduling information, realizing a complete closed loop of resource prediction, anomaly warning and scheduling plan, and upgrading the predicted resource scheduling information from predictive guidance to a feasible execution plan.

[0106] According to an embodiment of the present invention, the node includes physical machines. Generating a resource scheduling scheme based on a target scheduling method includes: verifying the target scheduling method based on the available resource capacity of multiple physical machines to obtain a verification result; if the verification result indicates a conflict between the target scheduling method and the available resource capacity of multiple physical machines, generating a resource scheduling scheme for expanding the capacity of multiple physical machines based on the target scheduling method; and if multiple physical machines do not meet the expansion conditions, generating a resource scheduling scheme that matches the available resource capacity of multiple physical machines using a prediction model.

[0107] For example, the target scheduling method could be "expand capacity by 1.5 times," while the available resource capacity of the physical machines is "only 6 cores remaining." Because a 1.5-fold expansion is difficult to achieve with only 6 cores remaining, the verification result indicates a conflict between the target scheduling method and the available resource capacity of multiple physical machines. In this case, based on the target scheduling method "expand capacity by 1.5 times" and the current available resource capacity of the physical machines "only 6 cores remaining," the expansion requirement for the physical machines can be determined (e.g., deploying three idle physical machines to support services or upgrading the hardware of physical machines to increase single-machine resource capacity) to achieve "expand capacity by 1.5 times."

[0108] According to an embodiment of the present invention, multiple physical machines not meeting the expansion conditions may be due to the difficulty in expanding the capacity of multiple physical machines on the cloud platform (such as the cloud platform having no idle physical machines or the physical machines having difficulty increasing their single-machine resource capacity through hardware upgrades).

[0109] When multiple physical machines do not meet the expansion requirements, a predictive model is used to generate a resource scheduling scheme for virtual machines. For example, sufficient memory and computing resources are allocated to newly added virtual machines, which are then added to the service cluster of business u.

[0110] According to embodiments of the present invention, when the target scheduling method is difficult to adapt to the current physical machine resources, resource conflicts can be resolved by expanding the capacity of physical machines to supplement resource supply, ensuring that the target scheduling method can be successfully implemented, while also taking into account the scalability of physical machine resources. When the physical machine cannot be expanded (e.g., due to hardware limitations), the original target scheduling method is not forcibly executed. Instead, the scheduling scheme is adjusted by combining a predictive model with the actual available resources of the physical machines to achieve resource-adaptive scheduling.
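The verification and fallback flow can be summarized in a small decision function. This is a sketch under simplifying assumptions: capacity is measured in CPU cores only, expansion means adding idle physical machines of equal size, and the fallback branch merely signals that the prediction model should generate an adapted scheme. All parameter names and action labels are hypothetical.

```python
import math


def plan_scheduling(required_cores: int, available_cores: int,
                    idle_physical_machines: int, cores_per_machine: int) -> dict:
    """Verify the target method against capacity and choose the next step."""
    if required_cores <= available_cores:
        # No conflict: the target scheduling method can be applied as is.
        return {"action": "apply_target_method"}
    shortfall = required_cores - available_cores
    machines_needed = math.ceil(shortfall / cores_per_machine)
    if machines_needed <= idle_physical_machines:
        # Conflict, but the expansion conditions are met.
        return {"action": "expand_physical_machines", "machines": machines_needed}
    # Expansion conditions not met: hand over to the prediction model to
    # generate a scheme that fits the capacity actually available.
    return {"action": "generate_adapted_scheme", "usable_cores": available_cores}
```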

[0111] According to an embodiment of the present invention, the knowledge base includes a historical data layer, a real-time data layer, and a knowledge index layer. The historical data layer is used to store historical resource data, historical operation and maintenance data, historical anomaly handling data, and business attribute data of various services of the cloud platform. The real-time data layer is used to store real-time resource data from the distributed system storage nodes, operation data from the operation and maintenance interface, and real-time business data collected from business systems. The knowledge index layer is used to construct historical data indexes based on the data in the historical data layer, construct real-time data indexes based on the data in the real-time data layer, and construct semantic indexes related to cloud platform resource scheduling. The historical data indexes, real-time data indexes, and semantic indexes are used for retrieval operations.

[0112] According to an embodiment of the present invention, the knowledge base adopts a "layered storage + real-time synchronization" architecture, which is divided into a historical data layer, a real-time data layer and a knowledge index layer, and can provide data support for RAG retrieval and prediction models.

[0113] As shown in Table 1, the historical data layer is used to store historical resource data, historical business operation data, historical scheduling case data, historical operation and maintenance data obtained from operation and maintenance logs, historical anomaly handling data in the manually labeled case library, and business attribute data of various services.

[0114] The real-time data layer is used to store real-time resource data and real-time business operation data (collected every 5 minutes, in conjunction with node components) from the distributed system storage nodes, operation data from the operation and maintenance interface (such as manually adjusted scheduling schemes and execution result feedback), real-time business data collected from the business system (such as notifications of sudden business demands), and the real-time status of the cloud platform resource pool (number of idle physical machines, node load fluctuations, and network bandwidth usage).

[0115] For example, historical data indexes can be built as inverted indexes based on "business type + time period + resource indicators", such as "e-commerce transactions - 618 promotion - CPU", and the historical index is updated daily.
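As a minimal sketch of the composite key described above, the following assumes a plain in-memory dictionary stands in for the inverted index. The helper names `historical_key` and `build_inverted_index`, and the record fields `id`, `business`, `period`, and `indicator`, are hypothetical and are not specified by the embodiment.

```python
from collections import defaultdict


def historical_key(business_type, time_period, indicator):
    """Compose a 'business type + time period + resource indicator' key."""
    return f"{business_type}-{time_period}-{indicator}"


def build_inverted_index(records):
    """Map each composite key to the ids of the records that carry it."""
    index = defaultdict(list)
    for rec in records:
        key = historical_key(rec["business"], rec["period"], rec["indicator"])
        index[key].append(rec["id"])
    return dict(index)


# Illustrative records only; the field layout is an assumption.
records = [
    {"id": 1, "business": "e-commerce transactions", "period": "618 promotion", "indicator": "CPU"},
    {"id": 2, "business": "e-commerce transactions", "period": "618 promotion", "indicator": "CPU"},
    {"id": 3, "business": "logistics tracking", "period": "Double 11", "indicator": "memory"},
]
index = build_inverted_index(records)
```

Looking up `index["e-commerce transactions-618 promotion-CPU"]` then returns the ids of all records sharing that key, which is the behavior an inverted index is meant to provide.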

[0116] Real-time data indexes can be dynamically constructed based on "unique node identifier + unique business identifier + real-time indicator threshold", such as "physical machine 12 - e-commerce transaction - CPU ≥ 80%", and the real-time data index is updated in seconds.

[0117] Semantic indexes can be semantic vectors based on business descriptions and scheduling requirements, supporting natural language retrieval, such as "migration solutions for insufficient memory in logistics tracking business".

[0118] The real-time data layer adopts a dual mechanism of "stream processing + memory caching". That is, the distributed system (such as Kafka) receives the real-time data stream from the nodes. After the distributed system's stream processing framework cleans the data, it writes it to the node's memory database for RAG to retrieve quickly, and asynchronously synchronizes it to the historical data layer for archiving.

[0119] According to an embodiment of the present invention, the knowledge base is divided into a real-time data layer, a historical data layer, and a knowledge index layer. The real-time data layer enables the cloud platform to quickly query and calculate the current node status and business indicators, ensuring low latency and fast response for resource scheduling and anomaly detection. The historical data layer provides sufficient historical features and patterns for the resource prediction model and enables data traceability, improving the reliability of prediction and decision-making. The knowledge index layer constructs a structured index for historical data, business attributes, scheduling cases, etc., enabling rapid multi-dimensional location and retrieval during the retrieval process, significantly improving knowledge matching efficiency and retrieval accuracy, and reducing data preparation overhead before prediction model inference.

[0120] Table 1

[0121]

[0122] According to an embodiment of the present invention, the method further includes: in response to receiving a real-time resource data sequence from a node, filling in missing values in the real-time resource data sequence to obtain a filled real-time resource data sequence, wherein the real-time resource data sequence includes multiple real-time resource data arranged in chronological order; in the case of outliers in the real-time resource data sequence, obtaining replacement values based on the resource peak values and the weights of multiple reference points in the real-time resource data sequence, and updating the outliers to the replacement values to obtain a preprocessed real-time resource data sequence, wherein the weights are determined based on the time interval between the reference points and the resource peak values; and writing the preprocessed real-time resource data sequence to the real-time data layer synchronously in real time, and to the historical data layer asynchronously.

[0123] According to an embodiment of the present invention, if there are missing values in the first real-time resource data subsequence that satisfies the first duration in the real-time resource data sequence, the missing values are filled according to the mean of the first target data subsequence. The first target data subsequence and the first real-time resource data subsequence have the same duration and the same service.

[0124] The real-time resource data sequence can be divided according to a first duration, resulting in multiple first real-time resource data subsequences. For example, the first duration could be 30 minutes. It should be noted that there is no limit to the number of missing values within the first duration; it can be one or more.

[0125] The first target data subsequence can be a data subsequence extracted from the historical resource data sequence of the same business within the same time period (e.g., for business F, the time period of the first real-time resource data subsequence can be from 13:00 to 13:30 on July 12, and the time period of the first target data subsequence can be from 13:00 to 13:30 on July 11); or a data subsequence extracted from the real-time resource data sequence within an adjacent time period (e.g., for business F, the time period of the first real-time resource data subsequence can be from 13:00 to 13:30 on July 12, and the time period of the first target data subsequence can be from 13:30 to 14:00 on July 12).

[0126] If a second real-time resource data subsequence satisfying the second duration in the real-time resource data sequence has a preset number of consecutive missing values, linear interpolation is performed on the second real-time resource data subsequence to obtain a filled real-time resource data subsequence. The filled real-time resource data subsequence has the same mean as the second target data subsequence, and the second target data subsequence has the same service as the real-time resource data sequence.

[0127] The real-time resource data sequence can be divided according to a second duration to obtain multiple second real-time resource data subsequences. For example, the second duration could be 1 hour and 30 minutes. It should be noted that linear interpolation is applied only when the second real-time resource data subsequence contains the first preset number of consecutive missing values. Otherwise, the missing value filling method of the first real-time resource data subsequence is used.

[0128] For example, the first preset number can be determined based on the amount of data that can be collected in the second duration and the amount of data that can be collected in the first duration. Typically, the preset number can be greater than the amount of data that can be collected in the first duration and less than the amount of data that can be collected in the second duration.

[0129] According to an embodiment of the present invention, the second target data subsequence may be a data subsequence of the same time period extracted from the historical resource data sequence of the same service; or a data subsequence of adjacent time periods extracted from the real-time resource data sequence.

[0130] If there are missing values ​​in the third real-time resource data subsequence that meets the third duration in the real-time resource data sequence, the third real-time resource data subsequence will be updated to the third target data subsequence. The third target data subsequence is a subsequence of the same time period extracted from the historical resource data sequence of the same business. The first duration is less than the second duration, and the second duration is less than the third duration.

[0131] According to an embodiment of the present invention, a real-time resource data sequence can be divided according to a third duration to obtain multiple third real-time resource data subsequences. For example, the third duration can be 2 hours. It should be noted that the third real-time resource data subsequence is updated to the third target data subsequence only when it contains the second preset number of consecutive missing values. Otherwise, the missing value filling method of the second real-time resource data subsequence or the first real-time resource data subsequence is used.

[0132] For example, the second preset number can be determined based on the amount of data that can be collected in the third time period and the amount of data that can be collected in the second time period. Typically, the preset number can be greater than the amount of data that can be collected in the second time period and less than the amount of data that can be collected in the third time period.
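The three filling tiers described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the embodiment's implementation: missing samples are represented as `None`, the preset numbers are treated as "at least this many consecutive missing values", the function operates on a single window that has already been cut from the sequence, and the constraint that the interpolated subsequence share the mean of the second target data subsequence is not modeled. All function names are hypothetical.

```python
def longest_missing_run(seq):
    """Length of the longest run of consecutive None values."""
    best = run = 0
    for v in seq:
        run = run + 1 if v is None else 0
        best = max(best, run)
    return best


def fill_with_mean(seq, target):
    """Tier 1: replace each missing value with the mean of the target subsequence."""
    mean = sum(target) / len(target)
    return [mean if v is None else v for v in seq]


def fill_linear(seq):
    """Tier 2: linearly interpolate gaps that lie between two known samples.

    Leading or trailing gaps have no neighbour on one side and are left as None.
    """
    out = list(seq)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


def fill_missing(seq, target, first_preset, second_preset):
    """Pick a filling tier from the longest run of missing values (assumed rule)."""
    gap = longest_missing_run(seq)
    if gap == 0:
        return list(seq)
    if gap >= second_preset:
        return list(target)              # Tier 3: replace the whole subsequence
    if gap >= first_preset:
        return fill_linear(seq)          # Tier 2: linear interpolation
    return fill_with_mean(seq, target)   # Tier 1: mean of the target subsequence
```

For instance, with `first_preset=2` and `second_preset=4`, the window `[10.0, None, None, 40.0]` falls into the second tier and is interpolated to `[10.0, 20.0, 30.0, 40.0]`.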

[0133] According to an embodiment of the present invention, when there are outliers in the real-time resource data sequence, replacement values are obtained based on the resource peak values and the weights of multiple reference points in the real-time resource data sequence, and the outliers are updated to the replacement values to obtain a preprocessed real-time resource data sequence, wherein the weights are determined based on the time interval between the reference points and the resource peak values.

[0134] Outliers are identified by a combination of normal distribution indicators and business rules (non-normal distribution indicators). Instantaneous single-point anomalies are replaced by the "weighted average of adjacent reference points and resource peak values", while continuous multi-point anomalies are marked as "real business anomalies" and labeled separately.

[0135] For example, multiple reference points can be selected based on actual conditions without specific restrictions. For instance, if the time point of the outlier is 12:00 and the time point of the resource peak is 9:00, the time points of the multiple reference points can be 8:00, 10:00, 11:00, 13:00, etc.

[0136] The weights of multiple reference points can be determined based on the time interval between the reference point and the resource peak; for example, the smaller the time interval, the greater the weight allocation. A weighted average of the resource peak and multiple reference points can be taken to obtain the replacement value.
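The weighted average described above can be sketched as follows. The text only states that a smaller time interval to the resource peak yields a larger weight; the specific weighting `1 / (1 + interval)` and the weight of 1 given to the peak itself (zero interval) are assumptions made for illustration, and `replacement_value` is a hypothetical name.

```python
def replacement_value(peak_value, peak_time, reference_points):
    """Weighted average of the resource peak and the reference points.

    reference_points is a list of (time, value) pairs; times are in hours.
    Weights shrink as the interval to the peak grows (assumed form: 1 / (1 + interval)).
    """
    weights = [1.0]          # the peak has zero interval to itself
    values = [peak_value]
    for t, v in reference_points:
        weights.append(1.0 / (1.0 + abs(t - peak_time)))
        values.append(v)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Peak of 90 at 9:00, reference points of 70 at 8:00 and 10:00.
value = replacement_value(90.0, 9.0, [(8.0, 70.0), (10.0, 70.0)])
```

In this example the two reference points are one hour from the peak and each receive weight 0.5, so the replacement value is (90 + 35 + 35) / 2 = 80.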

[0137] The preprocessing process may also include data normalization, mapping all resource data to the [0,1] interval, using the formula: $X_{norm} = (X - X_{min}) / (X_{max} - X_{min})$, where $X$ is the resource data, and $X_{min}$ and $X_{max}$ are the minimum and maximum values of the resource indicator over the past year, respectively.
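A minimal sketch of this min-max normalization is given below. The function name `min_max_normalize` is hypothetical, and the guard against a zero range is an added assumption; values outside the one-year minimum and maximum would map outside [0,1], and whether they should be clipped is not stated in the text.

```python
def min_max_normalize(values, x_min, x_max):
    """Map each value to (x - x_min) / (x_max - x_min)."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    span = x_max - x_min
    return [(x - x_min) / span for x in values]


# CPU utilization samples with a yearly minimum of 20 and maximum of 80.
normalized = min_max_normalize([20.0, 50.0, 80.0], 20.0, 80.0)
```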

[0138] It should be noted that the historical resource data sequence, the real-time business operation data sequence, and the historical business operation data sequence are all preprocessed using the above method. The preprocessed data sequences are written to the real-time data layer in real time and to the historical data layer asynchronously.

[0139] That is, the write to the real-time data layer is performed synchronously, while the write to the historical data layer is performed asynchronously.
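One way to picture the synchronous and asynchronous write paths is the sketch below. It is an illustration only: a dictionary stands in for the in-memory database of the real-time data layer, a list stands in for the historical data layer, and a background thread with a queue stands in for whatever archiving mechanism the platform actually uses. The class name `LayeredWriter` and its methods are hypothetical.

```python
import queue
import threading


class LayeredWriter:
    def __init__(self):
        self.realtime = {}    # stand-in for the real-time data layer
        self.history = []     # stand-in for the historical data layer
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._archive, daemon=True)
        worker.start()

    def write(self, key, sequence):
        """Write synchronously to the real-time layer, then queue for archiving."""
        self.realtime[key] = sequence
        self._pending.put((key, sequence))

    def _archive(self):
        """Background loop that drains the queue into the historical layer."""
        while True:
            item = self._pending.get()
            self.history.append(item)
            self._pending.task_done()

    def flush(self):
        """Block until every queued item has been archived."""
        self._pending.join()


writer = LayeredWriter()
writer.write("node-12/cpu", [0.71, 0.74, 0.82])
writer.flush()
```

The caller of `write` returns as soon as the real-time layer is updated, so retrieval latency does not depend on how quickly the archive catches up.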

[0140] According to embodiments of the present invention, missing value filling ensures the accuracy of real-time data, and outlier handling removes abnormal data from real-time resource data caused by equipment failure, acquisition errors, or sudden interference, thus preventing abnormal data from interfering with subsequent analysis. The hierarchical storage method of "real-time synchronization + asynchronous writing" balances the timeliness of real-time data with the storage requirements of historical data, ensuring both the efficiency of real-time scheduling and anomaly early warning, and the reliability of resource prediction and historical analysis.

[0141] According to an embodiment of the present invention, determining at least one target data that matches a retrieval request in multiple dimensions from a data source, as a target dataset, includes: determining multiple retrieval data related to at least one of the numerical vector and semantic vector in the retrieval request from the data source through a knowledge indexing layer; and determining at least one target data from the multiple retrieval data based on the relevance between the multiple retrieval data and the retrieval request, respectively.

[0142] The numerical vector in the retrieval request can be a quantifiable request retrieval value used to represent resource indicators, operating parameters, etc. For example, the numerical vector "8" can represent that the CPU utilization rate is 8%.

[0143] Semantic vectors in retrieval requests can be used to represent unstructured request retrieval semantics such as business scenarios, scheduling intentions, and case descriptions. For example, semantic vectors can represent business scenarios. The knowledge index layer supports "multi-dimensional fusion retrieval." Numerical index retrieval can be "CPU utilization 90%", while semantic index retrieval can be "business scenario" (such as a major promotional period) or "problem type" (such as expansion requirements).

[0144] The RAG module can simultaneously retrieve relevant knowledge based on "metric values" (such as CPU utilization of 90%), "business scenarios" (such as peak sales periods), and "problem types" (such as expansion requirements). The search results are sorted by "relevance score" (maximum score of 100, with the top 10 results used as target data for the prediction model).
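The ranking step can be sketched as follows, assuming each retrieval result already carries a relevance score between 0 and 100. How the score itself is computed is not described here, so it is taken as given; `select_targets` and the `id` and `score` fields are hypothetical names.

```python
def select_targets(results, top_k=10, max_score=100):
    """Sort results by relevance score (descending) and keep the top_k."""
    valid = [r for r in results if 0 <= r["score"] <= max_score]
    ranked = sorted(valid, key=lambda r: r["score"], reverse=True)
    return ranked[:top_k]


# Illustrative scores only.
results = [
    {"id": "a", "score": 72},
    {"id": "b", "score": 95},
    {"id": "c", "score": 88},
]
targets = select_targets(results, top_k=2)
```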

[0145] According to embodiments of the present invention, the accuracy of structured numerical retrieval is balanced with the generalizability of unstructured semantic retrieval. It can accurately match hard constraints such as resource thresholds, load ranges, and hardware specifications, while also achieving semantic similarity matching of scheduling cases, fluctuation patterns, and business requirements, thus significantly improving the accuracy and comprehensiveness of retrieval results.

[0146] According to an embodiment of the present invention, the training dataset for the training process of the prediction model may include three years of historical business resource data from the cloud platform plus manually annotated high-confidence scheduling cases (≥90% matching degree) from the knowledge base. The loss function uses root mean square error (RMSE) to measure the deviation between the predicted value and the actual resource demand.
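For reference, the RMSE loss mentioned above can be computed as in the following sketch. It uses only the standard definition of root mean square error; the function name and the input checks are illustrative additions.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual resource demand."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the errors are squared before averaging, large individual deviations are penalized more heavily than several small ones.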

[0147] The incremental training process involves adding six months of business data and updating the knowledge base every six months to fine-tune the model and adapt it to changes in business patterns.

[0148] The RAG module is embedded between "predictive model training" and "resource prediction / scheduling scheme generation," forming a collaborative logic of "retrieval-enhancement-generation." The specific process is as follows: Before the predictive model performs resource prediction (e.g., at 1 AM daily), it automatically retrieves "historical similar scenario data of the business to be predicted" (e.g., when predicting the CPU demand of e-commerce on June 5th, it retrieves data and cases from June 1st to 10th of the past 3 years). Alternatively, the retrieval can be triggered in real time when the real-time data synchronization layer detects that "business indicators exceed thresholds" (e.g., CPU ≥ 85%) or when operations and maintenance personnel initiate a retrieval request.

[0149] Figure 3 shows a schematic diagram of multi-dimensional fusion retrieval according to an embodiment of the present invention.

[0150] As shown in Figure 3, in response to receiving a retrieval request 310 regarding cloud platform resource scheduling, the request type of the retrieval request is determined.

[0151] The specific data source is determined based on the different request types. For example, in the case of a resource prediction request (320), the corresponding data source includes historical resource data and historical prediction errors from similar historical services (350). In the case of a scheduling scheme generation request (330), the corresponding data source includes scheduling case data from similar historical services and real-time resource data from multiple nodes (360). In the case of an anomaly warning request (340), the corresponding data source includes historical anomaly handling schemes from similar historical anomalies and business risk assessment data (370).
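The mapping from request type to data source can be sketched as a simple lookup table. The enum members and the data source identifiers below are hypothetical labels for the categories named in the text; the embodiment does not prescribe any particular representation.

```python
from enum import Enum


class RequestType(Enum):
    RESOURCE_PREDICTION = "resource_prediction"
    SCHEDULING_SCHEME = "scheduling_scheme_generation"
    ANOMALY_WARNING = "anomaly_warning"


# Pre-configured mapping; the string identifiers are illustrative.
DATA_SOURCES = {
    RequestType.RESOURCE_PREDICTION: (
        "historical_resource_data",
        "historical_prediction_errors",
    ),
    RequestType.SCHEDULING_SCHEME: (
        "scheduling_case_data",
        "realtime_node_resource_data",
    ),
    RequestType.ANOMALY_WARNING: (
        "historical_anomaly_handling_schemes",
        "business_risk_assessment_data",
    ),
}


def data_sources_for(request_types):
    """Return the data sources for all given request types, without duplicates."""
    sources = []
    for request_type in request_types:
        for source in DATA_SOURCES[request_type]:
            if source not in sources:
                sources.append(source)
    return sources
```

Accepting a list of request types reflects that a single retrieval request may include more than one type.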

[0152] The knowledge index layer of the knowledge base is used to perform multi-dimensional fusion retrieval (numerical index and semantic index) 380, and data with relevance greater than the preset relevance threshold is selected as target data 390.

[0153] The search results processing can be as follows: structured data is organized into a "case-effect" table, unstructured data is converted into a "problem-solution-effect" summary, and real-time data is labeled with "fluctuation trend" (e.g., "Business F's CPU has increased from 70% to 82% in the past hour").

[0154] According to an embodiment of the present invention, after scheduling execution, the actual effect (e.g., CPU utilization of 68% after business F expansion) and the RAG knowledge matching degree (e.g., a 3% deviation from historical cases, with a matching degree of 94%) are returned to the knowledge base. The knowledge base automatically updates the historical data layer case annotations (e.g., the 2025 618 business F expansion plan, with the effect meeting the target) and optimizes the knowledge index layer relevance algorithm. If the deviation is >15%, a manual annotation process is triggered, and the operation and maintenance personnel supplement the reason for the deviation (e.g., "the promotional efforts in 2025 exceeded expectations") and update it to the historical data layer.
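The feedback routing above can be sketched as follows. The text gives the 15% trigger but not how the deviation is computed; the sketch assumes a relative deviation between the effect expected from the referenced historical case and the effect actually observed. The function names and the returned labels are hypothetical.

```python
def relative_deviation(expected, actual):
    """Relative deviation of the observed effect from the expected effect (assumed definition)."""
    if expected == 0:
        raise ValueError("expected effect must be non-zero")
    return abs(actual - expected) / abs(expected)


def feedback_action(expected, actual, manual_threshold=0.15):
    """Route the execution result to automatic update or manual annotation."""
    deviation = relative_deviation(expected, actual)
    if deviation > manual_threshold:
        return "manual_annotation"
    return "auto_update"


# Expected CPU utilization of 70% after expansion, observed 68%.
action = feedback_action(70.0, 68.0)
```

With these numbers the deviation is roughly 2.9%, so the case annotations would be updated automatically; an observed value of 90% against the same expectation would exceed 15% and be routed to operation and maintenance personnel.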

[0155] Figure 4 shows a flowchart of a resource scheduling method according to another embodiment of the present invention.

[0156] As shown in Figure 4, another embodiment of the resource scheduling method of the present invention includes operations S410 to S460.

[0157] In operation S410, in response to receiving a retrieval request regarding cloud platform resource scheduling, a data source that matches the request type of the retrieval request is determined from the knowledge base.

[0158] In operation S420, an initial dataset is obtained by performing multi-dimensional fusion retrieval on the data source through the knowledge index layer.

[0159] In operation S430, based on the relevance between each of the multiple retrieval data in the initial dataset and the retrieval request, at least one target data is determined from the multiple retrieval data as the target dataset.

[0160] In operation S440, if the request type includes a resource prediction request, the prediction model is used to process statistical features, business-related features, and real-time resource data to obtain resource prediction information, which is then used as predicted resource scheduling information.

[0161] In operation S450, if the request type includes an anomaly warning request, the resource prediction value in the resource prediction information is compared with the preset resource threshold to determine that the cloud platform is in an abnormal state; and based on the historical anomaly handling schemes and business risk assessment data in the target dataset, an anomaly warning is generated as predicted resource scheduling information.

[0162] In operation S460, if the request type includes a scheduling scheme generation request, the target scheduling method is determined from multiple scheduling methods based on the real-time resource data of multiple nodes, resource prediction information, and the execution effects of each scheduling method; a resource scheduling scheme is generated based on the target scheduling method as the predicted resource scheduling information.
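The threshold comparison in operation S450 can be sketched as follows. This is a simplified illustration: predicted values and thresholds are plain dictionaries keyed by metric name, a metric is treated as abnormal when its predicted value reaches the threshold (the use of "greater than or equal to" is an assumption), and the handling schemes are looked up per metric. The function name, field names, and sample scheme text are hypothetical.

```python
def anomaly_warning(predicted, thresholds, handling_schemes):
    """Return a warning if any predicted value reaches its preset threshold.

    Returns None when no metric is predicted to reach its threshold.
    """
    exceeded = {
        metric: value
        for metric, value in predicted.items()
        if metric in thresholds and value >= thresholds[metric]
    }
    if not exceeded:
        return None
    schemes = [handling_schemes[m] for m in exceeded if m in handling_schemes]
    return {"exceeded": exceeded, "suggested_schemes": schemes}


# Illustrative inputs only.
warning = anomaly_warning(
    predicted={"cpu": 0.91, "memory": 0.40},
    thresholds={"cpu": 0.85, "memory": 0.90},
    handling_schemes={"cpu": "scheme retrieved from a similar historical anomaly"},
)
```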

[0163] According to an embodiment of the present invention, the operation and maintenance interface adds a "RAG search entry" and a "knowledge basis panel": operation and maintenance personnel can manually search for similar cases (such as "logistics tracking business migration plan") and view the historical data and real-time status of the plan reference (such as "referring to the 2024 Double 11 logistics migration to node 18, interruption for 8 seconds"). Operation and maintenance personnel can modify the plan based on the actual scenario (such as "business F has a promotion next week") or trigger secondary analysis of the prediction model (such as "re-predicting CPU demand during the promotion period").

[0164] According to an embodiment of the present invention, after the operations and maintenance personnel confirm the plan, the system sends the scheduling instruction to the cloud platform's dynamic resource scheduling module. This module calls the underlying interface to perform resource adjustment operations. After the operation is completed, the system provides feedback on the execution result to the operations and maintenance personnel (e.g., "Service F expansion successful, current CPU utilization 55%") and sends the execution data back to the knowledge base and large model for subsequent optimization.

[0165] According to embodiments of the present invention, by retrieving historical similar scenarios and real-time trends using RAG, the error rate of resource demand prediction is reduced from 18% of the traditional method to below 5% (e.g., the CPU prediction error during e-commerce promotional periods is reduced from 7% to 4%), accurately capturing seasonal and cyclical fluctuations in business.

[0166] RAG allows for real-time retrieval of node resource status and compares its performance with similar cases, avoiding expansion failures due to insufficient real-time resources. The success rate of scheduling scheme execution has increased from 99% to 99.8%.

[0167] Based on accurate prediction of scaling up, down and migration operations, the overall resource utilization of the cloud platform is improved by 25%-30% (compared to about 50% utilization by traditional static scheduling), reducing resource redundancy.

[0168] According to the RAG-assisted traceability solution, the time for operation and maintenance personnel to confirm the solution has been reduced from 5 minutes to 2 minutes, the error rate has been reduced to below 0.5%, and the time for abnormal handling has been reduced from 30 minutes to 15 minutes.

[0169] Through the feedback of execution results and manual annotation, the coverage of knowledge base business scenarios has increased from 80% to over 95%, and the proportion of RAG search relevance scores ≥80 has increased from 85% to 92%.

[0170] The predictive model supports incremental training to adapt to new business types, and the scheduling scheme supports manual adjustment to meet personalized needs.

[0171] Figure 5 shows a structural block diagram of a resource scheduling device according to an embodiment of the present invention.

[0172] As shown in Figure 5, the resource scheduling device 500 of this embodiment includes a first determining module 510, a second determining module 520, and a processing module 530.

[0173] The first determining module 510 is used to respond to a received retrieval request regarding cloud platform resource scheduling by determining a data source from the knowledge base that matches the request type of the retrieval request. The cloud platform includes multiple nodes, and the knowledge base includes various data sources that match the request type. The data sources store resource data, business operation data, and business attribute data of various services executable by the cloud platform for the corresponding multiple nodes. Resource data represents the available resources of a node, and business operation data represents the operational status of services on the node. In one embodiment, the first determining module 510 can be used to execute the operation S210 described above, which will not be repeated here.

[0174] The second determining module 520 is used to determine at least one target data from the data source that matches the retrieval request in multiple dimensions, as a target dataset, wherein the multiple dimensions include at least numerical dimensions and semantic dimensions. In one embodiment, the second determining module 520 can be used to perform the operation S220 described above, which will not be repeated here.

[0175] The processing module 530 is used to process the statistical features of the target dataset and the business association features extracted from the business attribute data of various businesses using a predictive model to obtain predictive resource scheduling information. The cloud platform uses the predictive resource scheduling information to schedule resources for multiple nodes. In one embodiment, the processing module 530 can be used to execute the operation S230 described above, which will not be repeated here.

[0176] According to an embodiment of the present invention, a mapping relationship between the request type of the retrieval request and the data source is pre-configured. The request type of the retrieval request includes at least one of the following: resource prediction request, scheduling scheme generation request, and anomaly warning request. Based on the mapping relationship, the data source matching the resource prediction request includes: historical resource data and historical prediction errors of similar historical services, wherein the similar historical services are associated with the target service in the retrieval request. Based on the mapping relationship, the data source matching the scheduling scheme generation request includes: scheduling case data of similar historical services and real-time resource data of multiple nodes. Based on the mapping relationship, the data source matching the anomaly warning request includes: historical anomaly handling schemes and business risk assessment data of similar historical anomalies, wherein the similar historical anomalies are associated with the target anomaly in the retrieval request. According to an embodiment of the present invention, the statistical features include the statistical features of historical resource data and historical business operation data of similar historical services in the target dataset. The processing module 530 includes a first processing submodule, a comparison submodule, and a first generation submodule. 
The first processing submodule is used to process statistical features, business-related features, and real-time resource data using a prediction model when the request type includes a resource prediction request, to obtain resource prediction information as predicted resource scheduling information; the comparison submodule is used to compare the resource prediction value in the resource prediction information with a preset resource threshold when the request type includes an anomaly warning request, to determine that the cloud platform is in an abnormal state; the first generation submodule is used to generate an anomaly warning based on the historical anomaly handling schemes and business risk assessment data in the target dataset, as predicted resource scheduling information.

[0177] According to an embodiment of the present invention, the above-described apparatus further includes a third determining module and a fourth determining module. The third determining module is used to take target historical resource data in the target dataset whose first similarity with the resource prediction information is greater than a preset similarity threshold as the knowledge basis of the resource prediction information; the fourth determining module is used to determine the credibility of the knowledge basis based on the second similarity between the knowledge basis and the resource prediction information and the historical prediction error of the target historical resource data.

[0178] According to an embodiment of the present invention, the scheduling case data of historical similar services in the target dataset includes multiple scheduling methods and their respective execution effects; the processing module 530 includes a first determining submodule and a second generating submodule. The first determining submodule is used to determine the target scheduling method from multiple scheduling methods based on real-time resource data of multiple nodes, resource prediction information, and the respective execution effects of the multiple scheduling methods when the request type includes a scheduling scheme generation request; the second generating submodule is used to generate a resource scheduling scheme based on the target scheduling method, as predicted resource scheduling information.

[0179] According to an embodiment of the present invention, the node includes physical machines, and the second generation submodule includes a verification unit, a first generation unit, and a second generation unit. The verification unit is used to verify the target scheduling method based on the available resource capacity of multiple physical machines and obtain a verification result; the first generation unit is used to generate a resource scheduling scheme for expanding multiple physical machines based on the target scheduling method when the verification result indicates a conflict between the target scheduling method and the available resource capacity of multiple physical machines; the second generation unit is used to generate a resource scheduling scheme that matches the available resource capacity of multiple physical machines using a prediction model when multiple physical machines do not meet the expansion conditions.

[0180] According to an embodiment of the present invention, the knowledge base includes a historical data layer, a real-time data layer, and a knowledge index layer. The historical data layer is used to store historical resource data, historical operation and maintenance data, historical anomaly handling data, and business attribute data of various services of the cloud platform. The real-time data layer is used to store real-time resource data via distributed system storage nodes, operation data of the operation and maintenance interface, and real-time business data collected from business systems. The knowledge index layer is used to construct historical data indexes based on the data in the historical data layer, construct real-time data indexes based on the data in the real-time data layer, and construct semantic indexes related to cloud platform resource scheduling. The historical data indexes, real-time data indexes, and semantic indexes are used for retrieval operations.

[0181] According to an embodiment of the present invention, the above-described apparatus further includes a filling module, an updating module, and a writing module. The filling module is used to, in response to receiving a real-time resource data sequence from a node, fill missing values in the real-time resource data sequence, resulting in a filled real-time resource data sequence, wherein the real-time resource data sequence includes multiple real-time resource data arranged in chronological order; the updating module is used to, in the case of outliers in the real-time resource data sequence, obtain replacement values based on the resource peak values and the weights of multiple reference points in the real-time resource data sequence, and update the outliers to the replacement values, resulting in a preprocessed real-time resource data sequence, wherein the weights are determined based on the time interval between the reference points and the resource peak values; the writing module is used to write the preprocessed real-time resource data sequence to the real-time data layer synchronously in real time, and to the historical data layer asynchronously.

[0182] According to an embodiment of the present invention, the second determining module 520 includes a multi-dimensional fusion retrieval submodule and a second determining submodule. The multi-dimensional fusion retrieval submodule is used to determine multiple retrieval data related to at least one of the numerical vectors and semantic vectors in the retrieval request from the data source through a knowledge index layer; the second determining submodule is used to determine at least one target data from the multiple retrieval data based on the relevance between the multiple retrieval data and the retrieval request respectively.

[0183] According to embodiments of the present invention, any plurality of modules among the first determining module 510, the second determining module 520, and the processing module 530 may be combined into one module, or any one of these modules may be split into multiple modules. Alternatively, at least a portion of the functionality of one or more of these modules may be combined with at least a portion of the functionality of other modules and implemented in one module. According to embodiments of the present invention, at least one of the first determining module 510, the second determining module 520, and the processing module 530 may be at least partially implemented as hardware circuitry, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system-on-a-chip, a system-on-a-substrate, a system-on-package, an application-specific integrated circuit (ASIC), or any other reasonable means of integrating or packaging circuitry, or implemented in software, hardware, or firmware, or in any suitable combination of any of these three implementation methods. Alternatively, at least one of the first determining module 510, the second determining module 520, and the processing module 530 may be at least partially implemented as a computer program module, which, when run, can perform corresponding functions.

[0184] Figure 6 shows a block diagram of an electronic device suitable for implementing a resource scheduling method according to an embodiment of the present invention.

[0185] As shown in Figure 6, an electronic device 600 according to an embodiment of the present invention includes a processor 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The processor 601 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)), etc. The processor 601 may also include onboard memory for caching purposes. The processor 601 may include a single processing unit or multiple processing units for performing different actions of the method flow according to an embodiment of the present invention.

[0186] RAM 603 stores various programs and data required for the operation of electronic device 600. Processor 601, ROM 602, and RAM 603 are interconnected via bus 604. Processor 601 executes various operations of the method flow according to embodiments of the present invention by executing programs in ROM 602 and / or RAM 603. It should be noted that the programs may also be stored in one or more memories other than ROM 602 and RAM 603. Processor 601 may also execute various operations of the method flow according to embodiments of the present invention by executing programs stored in said one or more memories.

[0187] According to an embodiment of the present invention, the electronic device 600 may further include an input / output (I / O) interface 605, which is also connected to a bus 604. The electronic device 600 may also include one or more of the following components connected to the input / output (I / O) interface 605: an input section 606 including a keyboard, mouse, etc.; an output section 607 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card, modem, etc. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the input / output (I / O) interface 605 as needed. A removable medium 611, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 610 as needed so that computer programs read from it can be installed into the storage section 608 as needed.

[0188] The present invention also provides a computer-readable storage medium, which may be included in the device / apparatus / system described in the above embodiments; or it may exist independently and not assembled into the device / apparatus / system. The computer-readable storage medium carries one or more programs, which, when executed, implement the method according to the embodiments of the present invention.

[0189] According to embodiments of the present invention, a computer-readable storage medium may be a non-volatile computer-readable storage medium, including, but not limited to: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In the present invention, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. For example, according to embodiments of the present invention, a computer-readable storage medium may include ROM 602 and/or RAM 603 and/or one or more memories other than ROM 602 and RAM 603 described above.

[0190] Embodiments of the present invention also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowchart. When the computer program product is run on a computer system, the program code is used to enable the computer system to implement the resource scheduling method provided in the embodiments of the present invention.

[0191] When the computer program is executed by the processor 601, it performs the functions defined in the system / apparatus of this invention. According to embodiments of the invention, the systems, apparatuses, modules, units, etc., described above can be implemented by computer program modules.

[0192] In one embodiment, the computer program may rely on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of signals over a network medium, and downloaded and installed via the communication section 609, and / or installed from the removable medium 611. The program code contained in the computer program can be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination thereof.

[0193] In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 609, and / or installed from the removable medium 611. When the computer program is executed by the processor 601, it performs the functions defined in the system of this embodiment of the invention. According to embodiments of the invention, the systems, devices, apparatuses, modules, units, etc., described above can be implemented by computer program modules.

[0194] According to embodiments of the present invention, program code for executing the computer programs provided in the embodiments of the present invention can be written in any combination of one or more programming languages. Specifically, these computer programs can be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, "C", or similar programming languages. The program code can be executed entirely on the user's computing device, partially on the user's device, partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

[0195] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0196] Those skilled in the art will understand that the features described in the various embodiments of the present invention can be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly described in the present invention. In particular, the features described in the various embodiments of the present invention can be combined and/or integrated in various ways without departing from the spirit and teachings of the present invention. All such combinations and/or integrations fall within the scope of the present invention.

[0197] The embodiments of the present invention have been described above. However, these embodiments are merely illustrative and not intended to limit the scope of the invention. Although various embodiments have been described above, this does not mean that the measures in the various embodiments cannot be used advantageously in combination. Various substitutions and modifications can be made by those skilled in the art without departing from the scope of the invention, and all such substitutions and modifications should fall within the scope of the invention.

Claims

1. A resource scheduling method, characterized in that the method includes: in response to receiving a retrieval request regarding cloud platform resource scheduling, determining, from a knowledge base, a data source matching the request type of the retrieval request, wherein the cloud platform includes multiple nodes, the knowledge base includes multiple data sources matching the request type, the data sources are used to store resource data, business operation data, and business attribute data of multiple services executable by the cloud platform for the corresponding multiple nodes, the resource data represents the available resources of a node, and the business operation data represents the operation status of a service on the node; determining, from the data source, at least one target data that matches the retrieval request in multiple dimensions as a target dataset, wherein the multiple dimensions include at least a numerical dimension and a semantic dimension; processing, using a predictive model, the statistical features of the target dataset and the business association features extracted from the business attribute data of the multiple services to obtain predicted resource scheduling information; and scheduling, by the cloud platform, resources for the multiple nodes using the predicted resource scheduling information.

2. The method according to claim 1, characterized in that a mapping relationship between the request type of the retrieval request and the data source is pre-configured, and the request type of the retrieval request includes at least one of the following: a resource prediction request, a scheduling scheme generation request, and an anomaly warning request; based on the mapping relationship, the data sources matching the resource prediction request include: historical resource data and historical prediction errors of historical similar services, wherein the historical similar services are associated with the target service in the retrieval request; based on the mapping relationship, the data sources matching the scheduling scheme generation request include: scheduling case data of the historical similar services and real-time resource data of the multiple nodes; and based on the mapping relationship, the data sources matching the anomaly warning request include: historical anomaly handling schemes and business risk assessment data for historical similar anomalies, wherein the historical similar anomalies are associated with the target anomaly in the retrieval request.

3. The method according to claim 2, characterized in that the statistical features include the statistical features of historical resource data and historical business operation data of the historical similar services in the target dataset; and processing, using a predictive model, the statistical features of the target dataset and the business association features extracted from the business attribute data of the multiple services to obtain predicted resource scheduling information includes: when the request type includes the resource prediction request, using the predictive model to process the statistical features, the business association features, and the real-time resource data to obtain resource prediction information, which is used as the predicted resource scheduling information; and when the request type includes the anomaly warning request, comparing the resource prediction value in the resource prediction information with a preset resource threshold to determine that the cloud platform is in an abnormal state, and generating an anomaly warning based on the historical anomaly handling schemes and business risk assessment data in the target dataset, which is used as the predicted resource scheduling information.

4. The method according to claim 3, characterized in that the method further includes: using historical resource data in the target dataset whose first similarity to the resource prediction information is greater than a preset similarity threshold as the knowledge basis for the resource prediction information; and determining the credibility of the knowledge basis based on a second similarity between the knowledge basis and the resource prediction information and on the historical prediction error of the target historical resource data.

5. The method according to claim 3, characterized in that the scheduling case data of the historical similar services in the target dataset includes multiple scheduling methods and their respective execution effects; and processing, using a predictive model, the statistical features of the target dataset and the business association features extracted from the business attribute data of the multiple services to obtain predicted resource scheduling information includes: when the request type includes the scheduling scheme generation request, determining a target scheduling method from the multiple scheduling methods based on the real-time resource data of the multiple nodes, the resource prediction information, and the execution effect corresponding to each of the multiple scheduling methods; and generating a resource scheduling scheme based on the target scheduling method, which is used as the predicted resource scheduling information.

6. The method according to claim 5, characterized in that the node includes a physical machine, and generating the resource scheduling scheme based on the target scheduling method includes: verifying the target scheduling method based on the available resource capacity of multiple physical machines to obtain a verification result; if the verification result indicates that the target scheduling method conflicts with the available resource capacity of the multiple physical machines, generating, based on the target scheduling method, a resource scheduling scheme for expanding the capacity of the multiple physical machines; and if the multiple physical machines do not meet the expansion conditions, using the predictive model to generate a resource scheduling scheme that matches the available resource capacity of the multiple physical machines.

7. The method according to claim 1, characterized in that the knowledge base includes a historical data layer, a real-time data layer, and a knowledge index layer; the historical data layer is used to store the cloud platform's historical resource data, historical operation and maintenance data, historical anomaly handling data, and the business attribute data of the multiple services; the real-time data layer is used to store the real-time resource data of the nodes, the operation data of the operation and maintenance interface, and the real-time business data collected from the business system via a distributed system; and the knowledge index layer is used to construct a historical data index based on the data in the historical data layer, a real-time data index based on the data in the real-time data layer, and a semantic index related to cloud platform resource scheduling, wherein the historical data index, the real-time data index, and the semantic index are used for retrieval operations.

8. The method according to claim 7, characterized in that the method further includes: in response to receiving a real-time resource data sequence from the node, filling missing values in the real-time resource data sequence to obtain a filled real-time resource data sequence, wherein the real-time resource data sequence includes multiple real-time resource data arranged in chronological order; when outliers are present in the real-time resource data sequence, obtaining a replacement value based on the resource peak value and the weights of multiple reference points in the real-time resource data sequence, and replacing the outlier with the replacement value to obtain a preprocessed real-time resource data sequence, wherein the weights are determined based on the time interval between the reference points and the resource peak value; and synchronously writing the preprocessed real-time resource data sequence to the real-time data layer in real time and asynchronously writing it to the historical data layer.

9. The method according to claim 7, characterized in that determining, from the data source, at least one target data that matches the retrieval request in multiple dimensions as a target dataset includes: determining, from the data source through the knowledge index layer, multiple retrieval data related to at least one of the numerical vectors and semantic vectors in the retrieval request; and determining at least one target data from the multiple retrieval data based on the relevance of each of the multiple retrieval data to the retrieval request.

10. An electronic device, comprising: one or more processors; and a memory used to store one or more computer programs, characterized in that the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1 to 9.