A big data scheduling method based on digital twins
By constructing a scheduling evolution structure graph based on digital twins, the method identifies irreversible scheduling paths before execution, enabling proactive risk avoidance in the scheduling process and improving the stability and reliability of big data scheduling.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING GUFENG TECHNOLOGY CO LTD
- Filing Date
- 2026-03-04
- Publication Date
- 2026-06-12
AI Technical Summary
Existing big data scheduling methods struggle to identify whether a scheduling path will enter an irreversible scheduling structure in complex business scenarios, resulting in insufficient stability and reliability of scheduling results and a high risk of long-term instability or resource conflicts.
By constructing a scheduling evolution structure graph using digital twin technology, scheduling states are mapped to structure nodes, state transition relationships are mapped to structure edges, and constraint parameters are configured to generate an irreversible scheduling structure decision quantity, so that irreversible scheduling paths are identified and avoided in advance.
Identifying and avoiding irreversible paths before scheduling execution reduces the risk of scheduling failure and system instability, improves scheduling stability and reliability, and ensures the controllability and sustainability of the scheduling process.
[Figure: abstract drawing, CN122197316A]
Description
Technical Field
[0001] This invention relates to the field of big data scheduling, and in particular to a big data scheduling method based on digital twins.
Background Art
[0002] In complex business scenarios, big data scheduling methods typically model the attribute information of tasks and resources, combine them with established scheduling rules or optimization algorithms, and generate scheduling paths or scheduling instructions to coordinate resource allocation and task execution. In existing technologies, some solutions introduce historical scheduling data or real-time running data to dynamically adjust the scheduling results, while others describe the evolution of the scheduling process by constructing state diagrams or flowcharts, thereby improving scheduling efficiency and system responsiveness to a certain extent.
[0003] Existing technologies generally focus on analyzing the current scheduling state or local optimization objectives and do not characterize the overall structure of the scheduling state as it evolves. In particular, it is difficult to identify, before scheduling execution, whether a scheduling path will enter an irreversible scheduling structure. Once the scheduling process crosses a critical constraint boundary, it can often only be corrected passively after the problem occurs, which can easily lead to long-term instability or resource conflicts and undermines the stability and reliability of the scheduling results.
Summary of the Invention
[0004] One objective of this invention is to propose a big data scheduling method based on digital twins. This invention uses digital twins and scheduling evolution structure modeling to achieve advance identification and avoidance of the risk of irreversible scheduling paths, thereby improving scheduling stability and reliability.
[0005] A big data scheduling method based on digital twins according to an embodiment of the present invention includes the following steps:
Collecting multi-source scheduling data and preprocessing it to generate a multi-source scheduling dataset;
Based on the multi-source scheduling dataset, performing digital twin mapping on task entities and resource entities to determine the corresponding twin object identifiers and their associations, thereby generating a scheduling digital twin;
In the scheduling digital twin, encoding the states of the twin objects by time index, mapping historical scheduling trajectory data and real-time operation data into a scheduling state sequence, and generating a candidate scheduling path set based on the scheduling state sequence;
Constructing a scheduling evolution structure based on the candidate scheduling path set, mapping scheduling states to structure nodes and state transition relationships to structure edges, and configuring structure edge constraint parameters based on constraint data to generate a scheduling evolution structure graph;
Based on the scheduling evolution structure graph, calculating an irreversible scheduling structure decision quantity for each candidate scheduling path in the candidate scheduling path set and determining an irreversibility boundary;
Based on the irreversibility boundary, determining whether each candidate scheduling path is irreversible and generating a path filtering result;
Based on the path filtering result, eliminating non-executable paths, determining the target scheduling path, and generating scheduling execution instructions.
[0006] Optionally, the multi-source scheduling data specifically includes task identification data, task attribute data, resource identification data, resource attribute data, task-resource association data, scheduling constraint data, historical scheduling trajectory data, and real-time operation data. The preprocessing specifically includes time alignment, format standardization, data cleaning, and association integration processing.
[0007] Optionally, the generation of the scheduling digital twin specifically includes:
Parsing the task-related data in the multi-source scheduling dataset to extract the corresponding task entity set;
Based on the task entity set, assigning a unique task twin object identifier to each task entity according to the task identifier field in the multi-source scheduling data, and generating task twin object state information from the data fields corresponding to the task entity;
Parsing the resource-related data in the multi-source scheduling dataset to extract the corresponding resource entity set;
Based on the resource entity set, assigning a unique resource twin object identifier to each resource entity according to the resource identifier field in the multi-source scheduling data, and generating resource twin object state information from the data fields corresponding to the resource entity;
Based on the association records between task entities and resource entities in the multi-source scheduling dataset, determining the twin object associations between task twin object identifiers and resource twin object identifiers, and configuring corresponding association parameter information for each twin object association;
Organizing the task twin object identifiers and their task twin object state information, the resource twin object identifiers and their resource twin object state information, and the twin object associations in a unified manner to generate the scheduling digital twin.
[0008] Optionally, the generation of the candidate scheduling path set specifically includes:
Based on the scheduling digital twin, determining a set of time indices aligned with the timestamps of the historical scheduling trajectory data and real-time operation data, and using this set as the time reference for generating subsequent scheduling states;
Under each time index, reading the task twin object state information corresponding to each task twin object identifier and the resource twin object state information corresponding to each resource twin object identifier in the scheduling digital twin, and combining them according to a unified state coding rule to generate the twin object state code under that time index;
Aligning and fusing the historical scheduling trajectory data and real-time operation data records under each time index with the twin object state code to generate the scheduling state under that time index;
Arranging the scheduling states generated under each time index in chronological order to form a scheduling state sequence;
Combining the scheduling states in the scheduling state sequence according to a time continuity rule to generate multiple candidate scheduling paths composed of consecutive scheduling states;
Collecting and labeling the generated candidate scheduling paths to form the candidate scheduling path set.
[0009] Optionally, the generation of the scheduling evolution structure graph specifically includes:
For the candidate scheduling path set, analyzing the chronologically ordered scheduling states in each candidate scheduling path and collecting all scheduling states that appear in the candidate scheduling path set to generate a scheduling state set;
Assigning a unique structure node identifier to each scheduling state in the scheduling state set and mapping each scheduling state to the corresponding structure node to form a structure node set;
Based on the order of adjacent scheduling states in the candidate scheduling path set, extracting the state transition relationships between scheduling states and mapping each state transition relationship to a structure edge connecting the corresponding structure nodes to generate a structure edge set;
For each structure edge in the structure edge set, determining the source structure node and the target structure node according to the chronological order of the scheduling states in the candidate scheduling path, thereby fixing the direction of the structure edge;
Using the constraint fields related to scheduling state transitions in the constraint data, configuring the corresponding structure edge constraint parameters for the state transition relationship represented by each structure edge;
Organizing the structure node set, the structure edge set, and the structure edge constraint parameters of each structure edge in a unified manner to generate the scheduling evolution structure graph.
[0010] Optionally, the generation of the path filtering result specifically includes:
Extracting the structure node set, the structure edge set, and the structure edge constraint parameters of each structure edge from the scheduling evolution structure graph, and mapping each candidate scheduling path in the candidate scheduling path set to a path structure representation composed of consecutive structure nodes and the structure edges between them;
For each path structure representation, extracting the corresponding structure edge constraint parameters one by one in the order of the structure edges to generate the path constraint parameter set of the candidate scheduling path;
Based on the path constraint parameter set, aggregating the constraint satisfaction of each structure edge in the candidate scheduling path to generate an irreversible scheduling structure decision quantity that represents the overall cumulative constraint state of the candidate scheduling path;
Collecting the irreversible scheduling structure decision quantities of all candidate scheduling paths in the candidate scheduling path set to determine the irreversibility boundary used to distinguish reversible paths from irreversible paths;
Based on the irreversibility boundary, performing an irreversibility determination on each candidate scheduling path in the candidate scheduling path set and generating the corresponding irreversibility determination flag;
Collecting the irreversibility determination flags of all candidate scheduling paths to generate the path filtering result.
[0011] Optionally, the generation of the scheduling execution instructions specifically includes:
Reading the path filtering result and obtaining the executability determination flag of each candidate scheduling path in the candidate scheduling path set;
Based on the executability determination flags, filtering non-executable paths out of the candidate scheduling path set and collecting the remaining paths to generate an executable scheduling path set;
For the executable scheduling path set, parsing each executable scheduling path in the time index order of its scheduling states and extracting the corresponding scheduling state sequence;
Comparing the executable scheduling paths in the executable scheduling path set according to a preset path selection rule to determine the target scheduling path;
Along the target scheduling path, performing state difference analysis on adjacent scheduling states in the scheduling state sequence, extracting the changed items of the task twin object state information and resource twin object state information, and generating a scheduling action set;
Organizing the scheduling action set in time index order to generate the scheduling execution instructions.
[0012] The beneficial effects of this invention are: This invention performs unified preprocessing on multi-source scheduling data and constructs a scheduling digital twin oriented towards task entities and resource entities. This maps the scattered and fragmented state information in the traditional scheduling process into a continuously evolving twin object state, transforming the scheduling process from static decision-making into a traceable and analyzable dynamic evolution process. By aligning historical scheduling trajectory data with real-time running data through time indexing and state encoding in the digital twin, this invention can characterize the changing relationships of scheduling states within the same temporal framework, thereby providing a unified and stable data foundation for subsequent scheduling path generation and structural analysis.
[0013] This invention constructs a scheduling evolution structure graph based on a set of candidate scheduling paths, mapping scheduling states to structure nodes and state transition relationships to structure edges. It also explicitly configures constraint information related to scheduling state transitions as structure edge constraint parameters. This allows the scheduling evolution structure to not only reflect the topological relationships between scheduling states but also to carry scheduling constraint semantics. Through this structured expression, this invention can perform an overall structural analysis of candidate scheduling paths before scheduling execution, rather than relying solely on local states or single indicators for judgment, thereby significantly improving the ability to perceive potential risks in the scheduling process.
[0014] This invention calculates the cumulative constraint state of candidate scheduling paths in the scheduling evolution structure graph to generate an irreversible scheduling structure decision quantity, and determines the irreversibility boundary accordingly, thus distinguishing reversible paths from irreversible paths. This mechanism enables the scheduling system to identify scheduling paths that may lead to long-term instability or resource conflicts before scheduling execution, and to proactively avoid non-executable paths during the path selection stage, reducing the risk of scheduling failure and system instability at the source. Finally, by generating a scheduling action set from state differences along the target scheduling path and its scheduling state sequence, and forming scheduling execution instructions from it, this invention realizes a closed-loop process in which scheduling execution is directly driven by a risk-controllable scheduling structure, thereby improving the stability, reliability, and sustainable operation of scheduling decisions in complex scheduling scenarios.
Brief Description of the Drawings
[0015] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:
Fig. 1 is a flowchart of the big data scheduling method based on digital twins proposed in this invention;
Fig. 2 is a schematic diagram of the scheduling evolution structure of the big data scheduling method based on digital twins proposed in this invention;
Fig. 3 is a schematic diagram of the irreversibility determination and path selection of the big data scheduling method based on digital twins proposed in this invention.
Detailed Description of the Embodiments
[0016] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.
[0017] Referring to Figs. 1-3, a big data scheduling method based on digital twins includes the following steps:
Collecting multi-source scheduling data and preprocessing it to generate a multi-source scheduling dataset;
Based on the multi-source scheduling dataset, performing digital twin mapping on task entities and resource entities to determine the corresponding twin object identifiers and their associations, thereby generating a scheduling digital twin;
In the scheduling digital twin, encoding the states of the twin objects by time index, mapping historical scheduling trajectory data and real-time operation data into a scheduling state sequence, and generating a candidate scheduling path set based on the scheduling state sequence;
Constructing a scheduling evolution structure based on the candidate scheduling path set, mapping scheduling states to structure nodes and state transition relationships to structure edges, and configuring structure edge constraint parameters based on constraint data to generate a scheduling evolution structure graph;
Based on the scheduling evolution structure graph, calculating an irreversible scheduling structure decision quantity for each candidate scheduling path in the candidate scheduling path set and determining an irreversibility boundary;
Based on the irreversibility boundary, determining whether each candidate scheduling path is irreversible and generating a path filtering result;
Based on the path filtering result, eliminating non-executable paths, determining the target scheduling path, and generating scheduling execution instructions.
[0018] In this embodiment, the multi-source scheduling data specifically includes task identification data, task attribute data, resource identification data, resource attribute data, task-resource association data, scheduling constraint data, historical scheduling trajectory data, and real-time operation data. The preprocessing specifically includes time alignment, format normalization, data cleaning, and association integration processing.
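The preprocessing step can be illustrated with a minimal Python sketch. The field names (`task_id`, `timestamp`), the 60-second alignment bucket, and the de-duplication rule are assumptions made for illustration and are not specified by the source; association integration is omitted, and timestamps are assumed to be ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime


def preprocess(records, bucket_seconds=60):
    """Standardize, clean, and time-align raw scheduling records.

    Hypothetical record layout: dicts with 'task_id' and 'timestamp' keys.
    """
    cleaned, seen = [], set()
    for raw in records:
        # Format standardization: normalize key spelling.
        rec = {key.strip().lower(): value for key, value in raw.items()}
        # Data cleaning: drop records without an identifier or timestamp.
        if not rec.get("task_id") or not rec.get("timestamp"):
            continue
        # Time alignment: map every timestamp onto a shared bucket grid.
        epoch = int(datetime.fromisoformat(rec["timestamp"]).timestamp())
        rec["time_index"] = epoch - epoch % bucket_seconds
        # Data cleaning: keep one record per (task, time index).
        key = (rec["task_id"], rec["time_index"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return sorted(cleaned, key=lambda r: r["time_index"])
```

Records from different sources that fall into the same bucket for the same task collapse into one entry, which is one simple way to obtain a consistent time base for the later steps.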
[0019] In this embodiment, the generation of the scheduling digital twin specifically includes:
Parsing the task-related data in the multi-source scheduling dataset to extract the corresponding task entity set;
Based on the task entity set, assigning a unique task twin object identifier to each task entity according to the task identifier field in the multi-source scheduling data, and generating task twin object state information from the data fields corresponding to the task entity;
Parsing the resource-related data in the multi-source scheduling dataset to extract the corresponding resource entity set;
Based on the resource entity set, assigning a unique resource twin object identifier to each resource entity according to the resource identifier field in the multi-source scheduling data, and generating resource twin object state information from the data fields corresponding to the resource entity;
Based on the association records between task entities and resource entities in the multi-source scheduling dataset, determining the twin object associations between task twin object identifiers and resource twin object identifiers, and configuring corresponding association parameter information for each twin object association;
Organizing the task twin object identifiers and their task twin object state information, the resource twin object identifiers and their resource twin object state information, and the twin object associations in a unified manner to generate the scheduling digital twin.
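One possible in-memory shape for the scheduling digital twin is sketched below. The class names, the identifier prefixes, and the record fields (`task_id`, `resource_id`) are hypothetical; the source only requires unique twin object identifiers, per-object state information, and parameterized associations.

```python
from dataclasses import dataclass, field


@dataclass
class TwinObject:
    twin_id: str
    kind: str    # "task" or "resource"
    state: dict  # twin object state information


@dataclass
class SchedulingTwin:
    tasks: dict = field(default_factory=dict)
    resources: dict = field(default_factory=dict)
    # (task twin id, resource twin id) -> association parameters
    associations: dict = field(default_factory=dict)


def build_twin(task_records, resource_records, association_records):
    """Map task and resource entities to twin objects and link them."""
    twin = SchedulingTwin()
    for rec in task_records:
        twin_id = f"task-twin:{rec['task_id']}"
        state = {k: v for k, v in rec.items() if k != "task_id"}
        twin.tasks[twin_id] = TwinObject(twin_id, "task", state)
    for rec in resource_records:
        twin_id = f"res-twin:{rec['resource_id']}"
        state = {k: v for k, v in rec.items() if k != "resource_id"}
        twin.resources[twin_id] = TwinObject(twin_id, "resource", state)
    for rec in association_records:
        key = (f"task-twin:{rec['task_id']}", f"res-twin:{rec['resource_id']}")
        # Only link twin objects that actually exist in the twin.
        if key[0] in twin.tasks and key[1] in twin.resources:
            twin.associations[key] = {
                k: v for k, v in rec.items() if k not in ("task_id", "resource_id")
            }
    return twin
```

Skipping associations whose endpoints are unknown is a design choice of this sketch; a production system might instead report them as data quality errors.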
[0020] In this embodiment, the generation of the candidate scheduling path set specifically includes:
Based on the scheduling digital twin, determining a set of time indices aligned with the timestamps of the historical scheduling trajectory data and real-time operation data, and using this set as the time reference for generating subsequent scheduling states;
Under each time index, reading the task twin object state information corresponding to each task twin object identifier and the resource twin object state information corresponding to each resource twin object identifier in the scheduling digital twin, and combining them according to a unified state coding rule to generate the twin object state code under that time index;
Aligning and fusing the historical scheduling trajectory data and real-time operation data records under each time index with the twin object state code to generate the scheduling state under that time index;
Arranging the scheduling states generated under each time index in chronological order to form a scheduling state sequence;
Combining the scheduling states in the scheduling state sequence according to a time continuity rule to generate multiple candidate scheduling paths composed of consecutive scheduling states;
Collecting and labeling the generated candidate scheduling paths to form the candidate scheduling path set.
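The source does not define the state coding rule or the time continuity rule. The sketch below assumes, purely for illustration, that a state code is a sorted tuple of twin states, that time indices are integers, and that a candidate path is any fixed-length window of states whose time indices are consecutive. The function names and the `length` parameter are hypothetical.

```python
def encode_state(task_states, resource_states):
    """Combine twin states under one time index into a hashable state code."""
    return (tuple(sorted(task_states.items())),
            tuple(sorted(resource_states.items())))


def build_state_sequence(snapshots):
    """snapshots: {time_index: (task_states, resource_states)}.

    Returns a chronological list of (time_index, state_code) pairs.
    """
    return [(t, encode_state(*snapshots[t])) for t in sorted(snapshots)]


def candidate_paths(sequence, length):
    """Return every window of `length` states with consecutive time indices."""
    paths = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        times = [t for t, _ in window]
        # Assumed time continuity rule: no gap between neighbouring indices.
        if all(b - a == 1 for a, b in zip(times, times[1:])):
            paths.append([code for _, code in window])
    return paths
```

With snapshots at time indices 0, 1, 2 and 4 and `length=2`, the windows (0, 1) and (1, 2) qualify, while (2, 4) is rejected because of the gap.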
[0021] In this embodiment, the generation of the scheduling evolution structure graph specifically includes:
For the candidate scheduling path set, analyzing the chronologically ordered scheduling states in each candidate scheduling path and collecting all scheduling states that appear in the candidate scheduling path set to generate a scheduling state set;
Assigning a unique structure node identifier to each scheduling state in the scheduling state set and mapping each scheduling state to the corresponding structure node to form a structure node set;
Based on the order of adjacent scheduling states in the candidate scheduling path set, extracting the state transition relationships between scheduling states and mapping each state transition relationship to a structure edge connecting the corresponding structure nodes to generate a structure edge set;
For each structure edge in the structure edge set, determining the source structure node and the target structure node according to the chronological order of the scheduling states in the candidate scheduling path, thereby fixing the direction of the structure edge;
Using the constraint fields related to scheduling state transitions in the constraint data, configuring the corresponding structure edge constraint parameters for the state transition relationship represented by each structure edge.
The configuration of the structure edge constraint parameters specifically includes:
Reading, from the constraint data, the constraint fields corresponding to the scheduling state transitions in the candidate scheduling path set;
Matching the constraint fields against the scheduling state transition relationship represented by each structure edge to determine the constraint field set of that structure edge;
For each structure edge, organizing the constraint values in its constraint field set into structure edge constraint parameters that describe the state transition constraint characteristics of the structure edge;
Configuring the corresponding structure edge constraint parameters for each structure edge in the structure edge set, so that each structure edge is associated with at least one structure edge constraint parameter.
Finally, the structure node set, the structure edge set, and the structure edge constraint parameters of each structure edge are organized in a unified manner to generate the scheduling evolution structure graph.
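The graph construction described above can be sketched as a directed graph held in two dictionaries. The lookup of constraint fields by `(source state, target state)` pair and the node identifier scheme (`n0`, `n1`, ...) are assumptions of this sketch, and scheduling states are assumed to be hashable.

```python
def build_evolution_graph(paths, constraint_data):
    """Build structure nodes and directed structure edges from candidate paths.

    paths: list of candidate paths, each a chronological list of states.
    constraint_data: {(src_state, dst_state): {constraint field: value}}.
    """
    node_ids = {}  # scheduling state -> structure node identifier
    edges = {}     # (source node id, target node id) -> constraint parameters
    for path in paths:
        for state in path:
            if state not in node_ids:
                node_ids[state] = f"n{len(node_ids)}"
        # Adjacent states define a transition; time order gives the direction.
        for src, dst in zip(path, path[1:]):
            edge = (node_ids[src], node_ids[dst])
            if edge not in edges:
                edges[edge] = dict(constraint_data.get((src, dst), {}))
    return node_ids, edges
```

In this sketch an edge without matching constraint fields receives an empty parameter dictionary. Since the text requires at least one constraint parameter per structure edge, a fuller implementation would probably reject or default such edges.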
[0022] In this embodiment, the generation of the path filtering result specifically includes:
Extracting the structure node set, the structure edge set, and the structure edge constraint parameters of each structure edge from the scheduling evolution structure graph, and mapping each candidate scheduling path in the candidate scheduling path set to a path structure representation composed of consecutive structure nodes and the structure edges between them;
For each path structure representation, extracting the corresponding structure edge constraint parameters one by one in the order of the structure edges to generate the path constraint parameter set of the candidate scheduling path;
Based on the path constraint parameter set, aggregating the constraint satisfaction of each structure edge in the candidate scheduling path to generate an irreversible scheduling structure decision quantity that represents the overall cumulative constraint state of the candidate scheduling path.
The generation of the irreversible scheduling structure decision quantity specifically includes:
Reading the constraint parameters of each structure edge in the path constraint parameter set one by one, in the order of the structure edges in the candidate scheduling path;
For the constraint parameters of each structure edge, determining whether they satisfy the constraint conditions associated with that structure edge and generating the corresponding structure edge constraint state mark;
Collecting the structure edge constraint state marks of the structure edges in the same candidate scheduling path in order to form the constraint state sequence of the candidate scheduling path;
Accumulating the constraint state sequence to generate an irreversible scheduling structure decision quantity that characterizes the degree of constraint accumulation of the candidate scheduling path in the scheduling evolution structure graph.
The irreversible scheduling structure decision quantities of all candidate scheduling paths in the candidate scheduling path set are then collected to determine the irreversibility boundary used to distinguish reversible paths from irreversible paths.
The determination of the irreversibility boundary specifically includes:
Collecting the irreversible scheduling structure decision quantities of all candidate scheduling paths in the candidate scheduling path set to form a decision quantity set;
Sorting the decision quantities in the decision quantity set according to a unified comparison rule;
Determining, in the sorted result, the boundary position used to divide the value range of the decision quantity;
Taking the decision quantity value at the boundary position as the irreversibility boundary, which is used to distinguish reversible paths from irreversible paths.
Based on the irreversibility boundary, an irreversibility determination is performed on each candidate scheduling path in the candidate scheduling path set and the corresponding irreversibility determination flag is generated; the irreversibility determination flags of all candidate scheduling paths are collected to generate the path filtering result.
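The text leaves three choices open: the per-edge constraint condition, how the constraint state marks are accumulated, and how the boundary position is picked in the sorted decision quantities. The sketch below fills them with labeled assumptions: an edge is violated when a hypothetical `load` exceeds a hypothetical `limit`, the decision quantity is the count of violated edges, and the boundary position is a fixed fraction of the sorted list (the `fraction` parameter is invented for illustration).

```python
def edge_violated(params):
    # Assumed constraint condition: load above limit marks a violation.
    return params["load"] > params["limit"]


def decision_quantity(path_edge_params):
    """Accumulate per-edge constraint state marks into one number."""
    marks = [1 if edge_violated(p) else 0 for p in path_edge_params]
    return sum(marks)


def irreversibility_boundary(quantities, fraction=0.5):
    """Sort the decision quantities and read the value at the boundary position."""
    ordered = sorted(quantities)
    position = min(int(fraction * len(ordered)), len(ordered) - 1)
    return ordered[position]


def screen_paths(paths_params, fraction=0.5):
    """paths_params: {path id: [edge constraint parameters in path order]}.

    Returns (quantities, boundary, flags); a True flag marks a path whose
    decision quantity lies above the boundary, treated here as irreversible.
    """
    quantities = {pid: decision_quantity(p) for pid, p in paths_params.items()}
    boundary = irreversibility_boundary(list(quantities.values()), fraction)
    flags = {pid: q > boundary for pid, q in quantities.items()}
    return quantities, boundary, flags
```

Because the boundary is derived from the candidate set itself, it is relative: if all candidates accumulate the same number of violations, none is flagged. An absolute threshold would behave differently, and the source does not say which is intended.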
[0023] In this embodiment, the generation of the scheduling execution instructions specifically includes:
Reading the path filtering result and obtaining the executability determination flag of each candidate scheduling path in the candidate scheduling path set;
Based on the executability determination flags, filtering non-executable paths out of the candidate scheduling path set and collecting the remaining paths to generate an executable scheduling path set;
For the executable scheduling path set, parsing each executable scheduling path in the time index order of its scheduling states and extracting the corresponding scheduling state sequence;
Comparing the executable scheduling paths in the executable scheduling path set according to a preset path selection rule to determine the target scheduling path.
The generation of the target scheduling path specifically includes:
In the executable scheduling path set, parsing the scheduling state sequence of each executable scheduling path in the time index order of its scheduling states and extracting path feature information that characterizes the execution characteristics of the executable scheduling path;
Based on the path feature information, comparing the executable scheduling paths in the executable scheduling path set one by one according to the preset path selection rule;
Determining the executable scheduling path that satisfies the preset path selection rule in the comparison result as the target scheduling path.
Along the target scheduling path, state difference analysis is performed on adjacent scheduling states in the scheduling state sequence, the changed items of the task twin object state information and resource twin object state information are extracted, and a scheduling action set is generated.
The generation of the scheduling action set specifically includes:
Along the target scheduling path, selecting pairs of adjacent scheduling states from the scheduling state sequence in time index order;
For each pair of adjacent scheduling states, comparing the corresponding task twin object state information and extracting the changed items of the task twin object state information;
For each pair of adjacent scheduling states, comparing the corresponding resource twin object state information and extracting the changed items of the resource twin object state information;
Combining the changed items of the task twin object state information and of the resource twin object state information to form the scheduling action corresponding to that pair of adjacent scheduling states;
Collecting the scheduling actions of all pairs of adjacent scheduling states in the target scheduling path to generate the scheduling action set.
Finally, the scheduling action set is organized in time index order to generate the scheduling execution instructions.
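The state difference analysis along the target scheduling path can be sketched as a pairwise dictionary diff. The tuple layout of a scheduling state `(time_index, task_states, resource_states)` and the shape of an action record are assumptions made for illustration.

```python
def diff_states(before, after):
    """Return {key: (old value, new value)} for every key whose value changed."""
    keys = sorted(before.keys() | after.keys())
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


def scheduling_actions(target_path):
    """Derive one scheduling action per pair of adjacent scheduling states.

    target_path: chronological list of
    (time_index, task_states, resource_states) tuples.
    """
    actions = []
    for (_, tasks0, res0), (t1, tasks1, res1) in zip(target_path, target_path[1:]):
        actions.append({
            "time_index": t1,  # the action takes effect at the later state
            "task_changes": diff_states(tasks0, tasks1),
            "resource_changes": diff_states(res0, res1),
        })
    # Organize the action set in time index order to form the instructions.
    return sorted(actions, key=lambda action: action["time_index"])
```

A path of three states therefore yields two actions, and a twin object whose state did not change between two adjacent states contributes nothing to that action.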
[0024] Example 1: To verify the feasibility of the present invention in implementation, it was applied to a large-scale cross-regional business collaborative scheduling scenario. In this scenario, there are a large number of business tasks executed in parallel and multiple types of shared resources. There are complex dependencies between tasks in terms of time sequence, resource consumption and execution conditions. Moreover, the scheduling process needs to be continuously adjusted under constantly changing operating conditions. Due to the interweaving and superposition of historical scheduling trajectories and real-time operating conditions, the scheduling path is prone to cross key constraint boundaries during the evolution process. Once it enters an irreversible state, it will lead to a sharp compression of the subsequent scheduling adjustment space and even cause long-term operational instability. Existing scheduling methods usually only discover the above problems after scheduling is executed, making it difficult to avoid risks in a timely manner.
[0025] In this scenario, scheduling-related information from different business units is first collected. The collected data covers task identifiers, task attributes, resource identifiers, resource attributes, the relationship between tasks and resources, scheduling constraints, historical scheduling trajectories, and real-time operating status. After time alignment, format unification, cleaning, and correlation integration, the above multi-source scheduling data forms a multi-source scheduling dataset, providing a consistent data foundation for subsequent scheduling analysis. Based on this, the task-related information in the multi-source scheduling dataset is parsed to extract task entity sets, and a corresponding twin object identifier is assigned to each task entity to generate task twin object status information reflecting the task execution status. At the same time, resource-related information is parsed to extract resource entity sets, and a corresponding twin object identifier is assigned to each resource entity to generate resource twin object status information reflecting the resource occupancy and availability status. By further analyzing the association records between task entities and resource entities, the association relationships between twin objects are determined, and the above information is uniformly organized to construct a scheduling digital twin.
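By way of non-limiting illustration, the construction of the scheduling digital twin from the multi-source scheduling dataset can be sketched as follows. The field names (`task_id`, `resource_id`, `demand`), the identifier prefixes, and the `SchedulingTwin` container are hypothetical choices made for this sketch only.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulingTwin:
    tasks: dict = field(default_factory=dict)      # task twin id -> status information
    resources: dict = field(default_factory=dict)  # resource twin id -> status information
    links: list = field(default_factory=list)      # (task twin id, resource twin id, parameters)


def build_twin(dataset: dict) -> SchedulingTwin:
    """Map task and resource entities to twin objects and record their associations."""
    twin = SchedulingTwin()
    for rec in dataset["tasks"]:
        twin_id = f"task-twin:{rec['task_id']}"
        twin.tasks[twin_id] = {k: v for k, v in rec.items() if k != "task_id"}
    for rec in dataset["resources"]:
        twin_id = f"res-twin:{rec['resource_id']}"
        twin.resources[twin_id] = {k: v for k, v in rec.items() if k != "resource_id"}
    for rec in dataset["associations"]:
        t_id = f"task-twin:{rec['task_id']}"
        r_id = f"res-twin:{rec['resource_id']}"
        # Assumption: association records that point at unknown entities are skipped.
        if t_id in twin.tasks and r_id in twin.resources:
            twin.links.append((t_id, r_id, {"demand": rec.get("demand")}))
    return twin


dataset = {
    "tasks": [{"task_id": "T1", "status": "waiting", "priority": 2}],
    "resources": [{"resource_id": "R1", "capacity": 4, "available": True}],
    "associations": [
        {"task_id": "T1", "resource_id": "R1", "demand": 1},
        {"task_id": "T9", "resource_id": "R1", "demand": 2},  # unknown task, skipped
    ],
}
twin = build_twin(dataset)
```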
[0026] In the scheduling digital twin, a unified set of time indices is generated based on the timestamps of historical scheduling trajectories and real-time operating status. This set serves as the time benchmark for generating scheduling states. Under each time index, the status information of the corresponding task twin object and the status information of the resource twin object are combined according to a unified rule to generate a scheduling state that reflects the overall current scheduling state. As the time index progresses, the continuously generated scheduling states are arranged into a scheduling state sequence and combined under the constraint of time continuity to form multiple candidate scheduling paths. These candidate scheduling paths fully reflect the possible evolution direction of the scheduling process and provide input for subsequent structural analysis.
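By way of non-limiting illustration, the unified set of time indices and the per-index scheduling states can be sketched as follows. The record fields (`ts`, `kind`, `id`, `state`) are hypothetical, and carrying the last known twin state forward to later time indices is an assumption of this sketch rather than a rule stated in the disclosure. The combination of states into multiple candidate scheduling paths is not shown.

```python
def unified_time_indices(history: list, realtime: list) -> list:
    """Merge the timestamps of both sources into one sorted set of time indices."""
    return sorted({r["ts"] for r in history} | {r["ts"] for r in realtime})


def state_sequence(history: list, realtime: list) -> list:
    """Build one scheduling state per time index, ordered by time."""
    records = history + realtime
    tasks, resources, sequence = {}, {}, []
    for t in unified_time_indices(history, realtime):
        for r in records:
            if r["ts"] == t:
                target = tasks if r["kind"] == "task" else resources
                target[r["id"]] = r["state"]
        # Encode the state deterministically so that equal states compare equal.
        sequence.append((t, tuple(sorted(tasks.items())), tuple(sorted(resources.items()))))
    return sequence


history = [
    {"ts": 0, "kind": "task", "id": "T1", "state": "waiting"},
    {"ts": 0, "kind": "resource", "id": "R1", "state": "free"},
]
realtime = [{"ts": 1, "kind": "task", "id": "T1", "state": "running"}]
sequence = state_sequence(history, realtime)
```

Because each scheduling state is encoded as a tuple, it is hashable and can later serve directly as the key of a structure node.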
[0027] The candidate scheduling path set is analyzed, and all scheduling states are collected. Each scheduling state is mapped to a unique structural node. At the same time, state transition relationships are extracted from the order of the scheduling states and mapped to structural edges connecting the structural nodes. Using the constraint fields related to state transitions in the scheduling constraint data, corresponding structural edge constraint parameters are configured for each structural edge, so that the scheduling evolution structure not only describes the topological relationships of state evolution but also carries scheduling constraint information. By uniformly organizing the structural nodes, the structural edges, and their constraint parameters, a scheduling evolution structure graph is generated, characterizing the scheduling process from an overall structural perspective. On this basis, each candidate scheduling path is mapped to a path structure representation consisting of structural nodes and structural edges, and the corresponding structural edge constraint parameters are extracted in order along the path to form a path constraint parameter set. By aggregating the path constraint parameter set, a non-reversible scheduling structure decision quantity reflecting the cumulative constraint state of the whole path is obtained. A unified analysis is then performed across the candidate scheduling paths to determine the non-reversible boundary that distinguishes reversible paths from non-reversible paths. In this way, scheduling paths that may enter a non-reversible state during their evolution are identified and marked before scheduling execution, thereby generating the path filtering results.
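By way of non-limiting illustration, the aggregation of structural edge constraint parameters into a decision quantity and the derivation of a boundary can be sketched as follows. The disclosure does not fix the aggregation or the boundary rule; this sketch assumes, purely for illustration, that each structural edge carries an integer constraint penalty, that the decision quantity is the sum of penalties along the path, and that the boundary is the mean decision quantity over all candidate paths.

```python
from statistics import mean

# Hypothetical structural edge constraint parameters: (source node, target node) -> penalty.
edge_params = {
    ("s0", "s1"): 2,
    ("s1", "s2"): 3,
    ("s0", "s3"): 6,
    ("s3", "s4"): 9,
}

# Candidate scheduling paths expressed as sequences of structural nodes.
paths = {
    "p1": ["s0", "s1", "s2"],
    "p2": ["s0", "s3", "s4"],
}


def decision_quantity(nodes: list) -> int:
    """Sum the constraint penalties of the structural edges along one path (assumed rule)."""
    return sum(edge_params[edge] for edge in zip(nodes, nodes[1:]))


quantities = {pid: decision_quantity(nodes) for pid, nodes in paths.items()}

# Assumed boundary rule: the mean decision quantity across all candidate paths.
boundary = mean(quantities.values())

# A path whose decision quantity exceeds the boundary is marked non-reversible.
non_reversible = {pid: q > boundary for pid, q in quantities.items()}
```

Under these assumptions, path `p2` accumulates a larger constraint penalty than the boundary and is marked non-reversible before execution, while `p1` is kept.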
[0028] During the scheduling execution phase, non-executable paths are first eliminated based on the path filtering results, leaving only executable scheduling paths. Within the set of executable scheduling paths, the target scheduling path is determined by comparing the scheduling state sequences of the paths. Along the scheduling state sequence of the target scheduling path, adjacent scheduling states are compared to identify the changes in the status information of task twin objects and resource twin objects between consecutive states. Based on these changes, a corresponding set of scheduling actions is generated, ultimately forming the scheduling execution instructions. With this implementation, scheduling decisions no longer rely on post-event adjustments; instead, non-reversible risks are identified and avoided before execution, ensuring that the scheduling process always operates within a controllable structural range.
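By way of non-limiting illustration, the elimination of non-executable paths and the choice of a target scheduling path can be sketched as follows. The selection rule used here (fewest scheduling states, ties broken by path identifier) is a hypothetical stand-in for the comparison of scheduling state sequences, and treating unflagged paths as non-executable is a conservative assumption of this sketch.

```python
def select_target_path(candidate_paths: dict, non_reversible_flags: dict):
    """Drop flagged paths, then pick a target among the remaining executable paths."""
    executable = {
        pid: states
        for pid, states in candidate_paths.items()
        # Assumption: a path without a flag is treated as non-executable.
        if not non_reversible_flags.get(pid, True)
    }
    if not executable:
        return executable, None
    # Hypothetical rule: prefer the path with the fewest states, then the smallest id.
    target = min(executable, key=lambda pid: (len(executable[pid]), pid))
    return executable, target


candidate_paths = {
    "p1": ["s0", "s1", "s2", "s5"],
    "p2": ["s0", "s3", "s4"],
    "p3": ["s0", "s1", "s5"],
}
flags = {"p1": False, "p2": True, "p3": False}
executable, target = select_target_path(candidate_paths, flags)
```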
[0029] During the continuous operation of this application scenario, it can be observed that the scheduling adjustment is more stable. By completing the analysis of the scheduling path structure risk before scheduling execution, the present invention effectively avoids the long-term instability problem caused by the non-reversible structure during the scheduling process, demonstrating the beneficial effect of improving the operational stability and reliability in complex scheduling environments.
[0030] Table 1. Overall performance comparison between the method of this invention and traditional scheduling methods in complex scheduling scenarios.
[0031] As can be seen from Table 1, in terms of the average number of backtracking operations for scheduling paths, traditional scheduling methods still frequently trigger path backtracking during scheduling execution, indicating that their scheduling decisions rely largely on post-event correction to maintain feasibility. In contrast, this invention constructs a scheduling evolution structure graph before scheduling execution and introduces a non-reversibility determination mechanism at the candidate scheduling path level, so that potentially high-risk paths are identified and eliminated before entering the execution stage. This significantly reduces the dependence on backtracking during execution and makes the overall scheduling process more coherent and stable.
[0032] In terms of the risk path identification rate before scheduling execution, the method of this invention shows a significant advantage over traditional methods. Traditional methods are usually based on constraint checks or local state judgments at a single point in time, and cannot systematically characterize the risks that gradually accumulate over the long-term evolution of a scheduling path. They can therefore identify only some explicit risk paths before execution. This invention maps scheduling constraints directly onto scheduling state transition relationships and performs an overall structural analysis of the candidate scheduling paths, so that problems that are hard to detect in the early stages are exposed in advance, significantly improving the ability to identify risk paths before scheduling execution.
[0033] The changes in the passive adjustment trigger rate and the proportion of scheduling failure backtracking during scheduling execution further confirm the above effects. Since the present invention has completed the screening of non-reversible paths before scheduling execution, the scheduling paths that enter the execution stage have better structural sustainability. Therefore, the occurrence of passive adjustment and failure backtracking is significantly reduced in actual operation. This improvement is not achieved by tightening scheduling conditions, but by avoiding structural risks in advance, so that the scheduling process can significantly improve stability while maintaining flexibility.
[0034] As can be seen from the average stable duration of the scheduling path and the occurrence rate of scheduling constraint conflicts, the target scheduling path generated by this invention can maintain a good fit with the scheduling constraints over a long period of time. The fundamental reason is that this invention does not treat the constraints as static restrictions, but integrates them into the structural representation of the scheduling state transition, so that the influence of the constraints on the evolution of the scheduling path can be continuously tracked and cumulatively evaluated, thereby effectively reducing the probability of constraint conflicts during the scheduling process.
[0035] The significant decrease in the state fluctuation amplitude index during the scheduling execution process indicates that the present invention fully considers the continuous evolution characteristics of the scheduling state in the time dimension during the scheduling action generation stage. By analyzing the changes in adjacent scheduling states, scheduling actions are generated, making the scheduling execution smoother and avoiding the drastic state fluctuations caused by local decisions in traditional methods. Overall, the present invention, by introducing a structured risk assessment and path selection mechanism before scheduling execution, realizes the transformation from "post-event remedial scheduling" to "pre-event avoidance scheduling," significantly improving the stability, reliability, and sustainable operation capability of the scheduling process in complex scheduling environments.
[0036] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. A big data scheduling method based on digital twins, characterized in that it includes the following steps: collecting multi-source scheduling data and preprocessing it to generate a multi-source scheduling dataset; based on the multi-source scheduling dataset, performing digital twin mapping on task entities and resource entities to determine the corresponding twin object identifiers and their associations, thereby generating a scheduling digital twin; in the scheduling digital twin, encoding the state of the twin objects according to the time index, mapping historical scheduling trajectory data and real-time operation data into scheduling state sequences, and generating a candidate scheduling path set based on the scheduling state sequences; constructing a scheduling evolution structure based on the candidate scheduling path set, in which scheduling states are mapped to structure nodes, state transition relationships are mapped to structure edges, and structure edge constraint parameters are configured based on constraint data, to generate a scheduling evolution structure graph; based on the scheduling evolution structure graph, calculating the non-reversible scheduling structure decision quantity for each candidate scheduling path in the candidate scheduling path set to determine the non-reversible boundary; based on the non-reversible boundary, determining the non-reversibility of each candidate scheduling path and generating the path filtering results; and based on the path filtering results, eliminating non-executable paths, determining the target scheduling path, and generating scheduling execution instructions.
2. The big data scheduling method based on digital twins according to claim 1, characterized in that the multi-source scheduling data specifically includes task identification data, task attribute data, resource identification data, resource attribute data, task-resource association data, scheduling constraint data, historical scheduling trajectory data, and real-time operation data; and the preprocessing specifically includes time alignment, format standardization, data cleaning, and association integration.
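By way of non-limiting illustration, the four preprocessing operations named above can be sketched as follows. The record fields (`task_id`, `ts`, `resource_id`), the choice of whole UTC seconds as the common time base, and the duplicate rule are assumptions of this sketch and are not specified by the claims.

```python
from datetime import datetime, timezone


def to_epoch(ts) -> int:
    """Time alignment: accept epoch numbers or ISO-8601 strings, return whole UTC seconds."""
    if isinstance(ts, (int, float)):
        return int(ts)
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    return int(dt.timestamp())


def preprocess(raw_records: list) -> list:
    seen, cleaned = set(), []
    for rec in raw_records:
        rec = {k.strip().lower(): v for k, v in rec.items()}     # format standardization
        if rec.get("task_id") is None or rec.get("ts") is None:  # data cleaning
            continue
        rec["ts"] = to_epoch(rec["ts"])                           # time alignment
        key = (rec["task_id"], rec["ts"])
        if key in seen:                                           # drop duplicate records
            continue
        seen.add(key)
        cleaned.append(rec)
    return sorted(cleaned, key=lambda r: r["ts"])


def integrate(records: list, associations: list) -> list:
    """Association integration: attach the resource ids linked to each task."""
    by_task = {}
    for a in associations:
        by_task.setdefault(a["task_id"], []).append(a["resource_id"])
    return [{**r, "resource_ids": by_task.get(r["task_id"], [])} for r in records]


raw = [
    {"Task_ID": "T1", "TS": "1970-01-01T00:00:10+00:00", "status": "waiting"},
    {"task_id": "T1", "ts": 10, "status": "waiting"},  # duplicate after alignment
    {"task_id": None, "ts": 5},                        # removed by cleaning
    {"task_id": "T2", "ts": 3, "status": "running"},
]
dataset = integrate(preprocess(raw), [{"task_id": "T1", "resource_id": "R1"}])
```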
3. The big data scheduling method based on digital twins according to claim 1, characterized in that the generation of the scheduling digital twin specifically includes: parsing the task-related data in the multi-source scheduling dataset to extract the corresponding task entity set; based on the task entity set, assigning a unique task twin object identifier to each task entity according to the task identifier field in the multi-source scheduling dataset, and generating task twin object status information based on the data fields corresponding to the task entity; parsing the resource-related data in the multi-source scheduling dataset to extract the corresponding resource entity set; based on the resource entity set, assigning a unique resource twin object identifier to each resource entity according to the resource identifier field in the multi-source scheduling dataset, and generating resource twin object status information based on the data fields corresponding to the resource entity; based on the association records between task entities and resource entities in the multi-source scheduling dataset, determining the twin object association relationships between task twin object identifiers and resource twin object identifiers, and configuring corresponding association parameter information for each twin object association relationship; and uniformly organizing the task twin object identifiers and their corresponding task twin object status information, the resource twin object identifiers and their corresponding resource twin object status information, and the twin object association relationships to generate the scheduling digital twin.
4. The big data scheduling method based on digital twins according to claim 1, characterized in that the generation of the candidate scheduling path set specifically includes: based on the scheduling digital twin, determining a set of time indices aligned with the timestamps of the historical scheduling trajectory data and the real-time operation data, and using this set of time indices as the time reference for the subsequent generation of scheduling states; under each time index, reading from the scheduling digital twin the task twin object status information corresponding to each task twin object identifier and the resource twin object status information corresponding to each resource twin object identifier, and combining them according to a unified state coding rule to generate the twin object state code under the corresponding time index; aligning and fusing the historical scheduling trajectory data and real-time operation data records under the corresponding time index with the twin object state code to generate the scheduling state under the corresponding time index; arranging the scheduling states generated under each time index in chronological order to form a scheduling state sequence; based on the scheduling state sequence, combining the scheduling states according to a time continuity rule to generate multiple candidate scheduling paths composed of continuous scheduling states; and uniformly collecting and identifying the generated candidate scheduling paths to form the candidate scheduling path set.
5. The big data scheduling method based on digital twins according to claim 1, characterized in that the generation of the scheduling evolution structure graph specifically includes: for the candidate scheduling path set, analyzing the scheduling states arranged in chronological order in each candidate scheduling path, and collecting all scheduling states appearing in the candidate scheduling path set to generate a scheduling state set; assigning a unique structural node identifier to each scheduling state in the scheduling state set, and mapping each scheduling state to the corresponding structural node to form a structural node set; based on the sequential relationship between adjacent scheduling states in the candidate scheduling path set, extracting the state transition relationships between scheduling states, and mapping each state transition relationship to a structural edge connecting the corresponding structural nodes to generate a structural edge set; for each structural edge in the structural edge set, determining the starting structural node and the target structural node according to the time order of the scheduling states in the candidate scheduling path, thereby completing the identification of the structural edge direction; using the constraint fields related to scheduling state transitions in the constraint data, configuring the corresponding structural edge constraint parameters for the state transition relationship corresponding to each structural edge; and uniformly organizing the structural node set, the structural edge set, and the structural edge constraint parameters corresponding to each structural edge to generate the scheduling evolution structure graph.
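By way of non-limiting illustration, the construction of the scheduling evolution structure graph can be sketched as follows. Representing scheduling states as strings, naming nodes `n0`, `n1`, and so on, and looking up constraint parameters by state pair are hypothetical choices made for this sketch.

```python
def build_structure_graph(candidate_paths: list, constraint_lookup: dict):
    """Return (node_ids, edges) for a directed graph built from candidate paths."""
    node_ids = {}  # scheduling state -> structural node identifier
    edges = {}     # (start node id, target node id) -> structural edge constraint parameters
    for path in candidate_paths:
        for state in path:
            # Each distinct scheduling state gets exactly one structural node.
            node_ids.setdefault(state, f"n{len(node_ids)}")
        for src, dst in zip(path, path[1:]):
            # Edge direction follows the time order of the states in the path.
            edge = (node_ids[src], node_ids[dst])
            edges.setdefault(edge, constraint_lookup.get((src, dst), {}))
    return node_ids, edges


candidate_paths = [
    ["idle", "loading", "running"],
    ["idle", "loading", "blocked"],
]
# Hypothetical constraint field attached to one state transition.
constraints = {("loading", "blocked"): {"max_wait_s": 30}}
node_ids, edges = build_structure_graph(candidate_paths, constraints)
```

Shared prefixes of different candidate paths collapse onto the same structural nodes and edges, so the two example paths produce four nodes and three directed edges.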
6. The big data scheduling method based on digital twins according to claim 1, characterized in that the generation of the path filtering results specifically includes: extracting the structural node set, the structural edge set, and the structural edge constraint parameters corresponding to each structural edge from the scheduling evolution structure graph, and mapping each candidate scheduling path in the candidate scheduling path set to a path structure representation composed of consecutive structural nodes and adjacent structural edges; for each path structure representation, extracting the corresponding structural edge constraint parameters one by one in the order of the structural edges in the path structure representation to generate the path constraint parameter set corresponding to the candidate scheduling path; based on the path constraint parameter set, aggregating the constraint satisfaction of each structural edge in the candidate scheduling path to generate a non-reversible scheduling structure decision quantity that represents the overall cumulative constraint state of the candidate scheduling path; uniformly collecting the non-reversible scheduling structure decision quantities corresponding to each candidate scheduling path in the candidate scheduling path set to determine the non-reversible boundary used to distinguish reversible paths from non-reversible paths; based on the non-reversible boundary, performing non-reversibility determination on each candidate scheduling path in the candidate scheduling path set and generating the corresponding non-reversibility determination flag; and collecting the non-reversibility determination flags corresponding to each candidate scheduling path to generate the path filtering results.
7. The big data scheduling method based on digital twins according to claim 1, characterized in that the generation of the scheduling execution instructions specifically includes: reading the path filtering results and obtaining the executability determination flag corresponding to each candidate scheduling path in the candidate scheduling path set; based on the executability determination flags, filtering out non-executable paths from the candidate scheduling path set and collecting the remaining paths to generate an executable scheduling path set; for the executable scheduling path set, parsing each executable scheduling path in the time index order of its scheduling states and extracting the corresponding scheduling state sequence; comparing the executable scheduling paths in the executable scheduling path set according to a preset path selection rule to determine the target scheduling path; along the scheduling state sequence of the target scheduling path, performing state difference analysis on adjacent scheduling states, extracting the changed items of the task twin object state information and the resource twin object state information, and generating a scheduling action set; and organizing the scheduling action set in time index order to generate the scheduling execution instructions.