Supervision knowledge graph dynamic modeling method based on multi-source heterogeneous data fusion
By constructing a four-dimensional knowledge graph from multi-source heterogeneous data and adopting graph neural networks and a virtual feature compensation mechanism, the method addresses incomplete knowledge coverage and lagging compliance inspection under dynamic engineering changes, enabling dynamic adaptation of supervision data and real-time compliance assessment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- JIANGSU YINTAISI INFORMATION TECH CO LTD
- Filing Date
- 2025-07-23
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies cannot adapt to dynamic changes in engineering and lack a compensation mechanism for scenarios with missing multimodal data, resulting in blind spots in knowledge coverage, delayed compliance checks, and high rectification costs.
A four-dimensional knowledge graph based on multi-source heterogeneous data is constructed. Graph neural networks are used to align entities, remote supervised learning is used to extract relationships, weighted fusion algorithms are used to process attributes, and a virtual feature compensation mechanism is used to fill data gaps. A compliance entropy evaluation model is introduced, and a reinforcement learning process is automatically triggered.
It enables dynamic fusion and compliance assessment of multi-source heterogeneous data, solves the problem of incomplete knowledge coverage, improves the comprehensiveness and real-time nature of supervision decisions, and reduces the lag and cost of compliance inspections.
Figure CN120874991B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of information engineering knowledge graph construction technology, and more specifically, to a dynamic modeling method for supervision knowledge graphs based on the fusion of multi-source heterogeneous data.
Background Technology
[0002] With the expansion of information technology projects (such as data center construction and smart government system deployment) and the acceleration of technological iteration, supervision work needs to integrate multi-source heterogeneous data, including spatial dimensions (BIM models, physical locations of equipment), temporal dimensions (real-time sensor status, process sequence), business dimensions (contract terms, progress flow), textual dimensions (supervision logs, specification documents), and visual dimensions (construction images, quality defects). However, existing technologies have significant limitations: knowledge graphs are mostly statically constructed and cannot adapt to dynamic changes in projects (such as equipment relocation and rule updates), and lack effective compensation mechanisms for scenarios with missing multimodal data (such as quality issues without on-site photos or status gaps caused by sensor malfunctions), resulting in blind spots in knowledge coverage and restricting the comprehensiveness and real-time nature of supervision decisions.
[0003] In terms of dynamic evolution and compliance management, existing technologies are lagging behind: ① Lack of knowledge dynamism: Knowledge graphs are not linked to timestamps, making it impossible to trace the evolution of "device status changes" and "rule updates," and making it difficult to support retrospective analysis; ② Gap in multimodal compensation: Faced with the lack of modalities such as images, spatial coordinates, and text, no historical case retrieval and feature fusion mechanism has been established, resulting in data gaps that cannot be filled; ③ Inefficient compliance inspections: Relying on manual spot checks, it is impossible to quantify compliance status (such as rule coverage and data consistency), let alone automatically trigger closed-loop processes such as rule optimization and data repair, resulting in a lag in the discovery of compliance risks (on average, a lag of 2-3 process nodes) and a sharp increase in rectification costs.
[0004] In summary, there is an urgent need to construct a supervision knowledge graph modeling method that supports dynamic fusion of multi-source heterogeneous data, multi-modal missing data compensation, intelligent compliance assessment, and self-optimization.
Summary of the Invention
[0005] To address the shortcomings of existing technologies, the purpose of this invention is to provide a dynamic modeling method for supervision knowledge graphs based on the fusion of multi-source heterogeneous data.
[0006] To achieve the above objectives, the present invention provides the following technical solution:
[0007] The dynamic modeling method for supervision knowledge graph based on multi-source heterogeneous data fusion includes the following steps:
[0008] S1: Real-time collection of multi-source heterogeneous data on information engineering;
[0009] S2: Based on multi-source heterogeneous data from information engineering, construct a four-dimensional knowledge graph modeling framework that includes entities, relationships, attributes, and timestamps;
[0010] In the four-dimensional knowledge graph modeling framework, each knowledge unit (entity, relationship, or attribute) carries a precise timestamp for its existence and its changes, so that knowledge evolution can be traced along a timeline. The framework is constructed as follows:
S21: A graph neural network is built with the DGL library, and node embeddings are used to align the same entity across sources (e.g., identifying "Server A" in BIM and "Device ID-001" in sensor data as the same entity);
S22: Distant supervision (remote supervised learning) is used to extract relationships from supervision documents (e.g., from "According to Article 5.2 of the Design Code, server deployment must meet heat dissipation requirements," the relationship (server deployment, must meet, heat dissipation requirements) is generated), and relationship weights are assigned according to document authority;
S23: For multi-source attributes of the same entity (e.g., BIM coordinates and sensor location data), a weighted fusion algorithm is applied (weights are determined by data credibility), and a collection timestamp is added to the fusion result.
[0011] S3: When the modal attributes of events to be updated are missing during the updating of the four-dimensional knowledge graph modeling framework, virtual features are mapped in the four-dimensional knowledge graph modeling framework.
[0012] S4: Construct a compliance entropy evaluation model, periodically calculate the compliance entropy value of the supervision knowledge graph, and automatically trigger the reinforcement learning process when the compliance entropy value is lower than the threshold.
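The "fourth dimension" in S2 is the timestamp attached to every knowledge unit. The following is a minimal sketch of that idea, not the patented implementation: the `Fact` and `TemporalGraph` names, the validity-interval representation, and the sample data are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Fact:
    """One knowledge unit (relationship or attribute) with a validity interval."""
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means the fact is still valid


class TemporalGraph:
    """Hypothetical in-memory store that keeps every historical version of a fact."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def assert_fact(self, subject: str, predicate: str, obj: str, at: datetime) -> None:
        # Close the currently open fact for the same (subject, predicate), if any.
        for i, f in enumerate(self.facts):
            if f.subject == subject and f.predicate == predicate and f.valid_to is None:
                self.facts[i] = Fact(f.subject, f.predicate, f.obj, f.valid_from, at)
        # Record the new value as the open fact.
        self.facts.append(Fact(subject, predicate, obj, at))

    def as_of(self, subject: str, predicate: str, at: datetime) -> Optional[str]:
        # Return the value that was valid at the requested point in time.
        for f in self.facts:
            if (f.subject == subject and f.predicate == predicate
                    and f.valid_from <= at
                    and (f.valid_to is None or at < f.valid_to)):
                return f.obj
        return None


g = TemporalGraph()
g.assert_fact("Server A", "located_in", "Rack 3", datetime(2025, 1, 10))
g.assert_fact("Server A", "located_in", "Rack 7", datetime(2025, 3, 2))  # relocation
before_move = g.as_of("Server A", "located_in", datetime(2025, 2, 1))
after_move = g.as_of("Server A", "located_in", datetime(2025, 4, 1))
```

Keeping closed intervals instead of overwriting values is what makes retrospective queries such as "where was this device before the relocation" possible.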
[0013] Furthermore, multi-source heterogeneous data includes spatial dimension data, temporal dimension data, business dimension data, text dimension data, and visual dimension data.
[0014] Spatial Dimension Data: A three-dimensional spatial coordinate system is constructed using BIM, LiDAR, and SLAM algorithms to achieve millimeter-level accuracy in equipment positioning (e.g., server rack coordinate error < ±2mm).
[0015] Temporal dimension data: IoT sensor data (such as network traffic and CPU load) is sampled at the millisecond level, and noise is filtered out with a Kalman filter so that only valid data (confidence level > 95%) is retained.
[0016] Business dimension data: Information engineering management systems (such as Jira and TAPD) are integrated through API gateways, structured data such as contract terms and schedules is extracted automatically, and unstructured requirements in change logs are parsed using natural language processing (NLP).
[0017] Text dimension data: A dynamic knowledge graph generator is built on LangChain + Neo4j. GPT-4 automatically parses supervision logs and expert experience documents to extract risk descriptions and disposal plans, and generates timestamped knowledge nodes.
[0018] Visual dimension data: On-site cameras equipped with YOLOv8 monitor construction quality in real time (such as missing cable markings and equipment installation deviations). Combined with the CLIP model, image features are aligned with the spatial coordinates of the BIM model to generate a chain of evidence with geographical location.
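The Kalman filtering mentioned for the sensor stream can be illustrated with a scalar filter. This is a minimal sketch under stated assumptions: a constant-state model, hypothetical noise variances (`process_var`, `meas_var`), and made-up CPU load samples. The 95% confidence screening mentioned in the text is not modeled here.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=0.25):
    """Scalar Kalman filter with a constant-state model; returns the filtered series."""
    estimate = measurements[0]  # initialise the state with the first sample
    error = 1.0                 # initial estimate variance (assumed)
    filtered = []
    for z in measurements:
        # Predict: the state is assumed constant, uncertainty grows by the process noise.
        error += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error / (error + meas_var)
        estimate = estimate + gain * (z - estimate)
        error = (1.0 - gain) * error
        filtered.append(estimate)
    return filtered


cpu_load = [50.0, 52.0, 49.0, 90.0, 51.0, 50.0]  # one spike at index 3
smoothed = kalman_1d(cpu_load)
```

With these settings the spike at index 3 is strongly damped, because by then the gain has dropped well below 1 and a single outlier only moves the estimate part of the way.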
[0019] Furthermore, the construction process of the four-dimensional knowledge graph modeling framework is as follows: S21: Use the DGL library to build a graph neural network and align the same entity from different sources through node embedding technology; S22: Use remote supervised learning to extract relations from the supervision documents and assign relation weights according to the authority of the documents; S23: For the multi-source attributes of the same entity, use a weighted fusion algorithm and add a collection timestamp to the fusion result.
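Step S23 (credibility-weighted attribute fusion with a collection timestamp) can be sketched as a weighted mean. The function name `fuse_attribute`, the credibility values, and the coordinates below are hypothetical; the source does not specify how credibility is measured.

```python
from datetime import datetime, timezone


def fuse_attribute(readings, collected_at):
    """Fuse (value_vector, credibility) pairs into a credibility-weighted mean.

    readings: list of (list_of_floats, credibility) tuples from different sources.
    collected_at: timestamp attached to the fused result.
    """
    total = sum(cred for _, cred in readings)
    if total <= 0:
        raise ValueError("at least one reading needs a positive credibility")
    dims = len(readings[0][0])
    fused = [
        sum(vec[d] * cred for vec, cred in readings) / total
        for d in range(dims)
    ]
    return {"value": fused, "collected_at": collected_at}


bim = ([12.000, 4.500, 0.800], 0.9)     # BIM coordinates in meters, assumed credibility
sensor = ([12.010, 4.490, 0.800], 0.6)  # sensor position in meters, assumed credibility
ts = datetime(2025, 7, 23, 8, 0, tzinfo=timezone.utc)
result = fuse_attribute([bim, sensor], collected_at=ts)
```

Normalising by the sum of credibilities keeps the fused value inside the range spanned by the source values, whatever scale the credibility scores use.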
[0020] Further, the process of mapping virtual features: S31: When the knowledge graph receives an event to be updated, it parses the modal attributes of the event to be updated.
[0021] S32: When the image or spatial-coordinate modal attribute of the event to be updated is missing, the compensation mechanism is activated.
[0022] S33: After the compensation mechanism is activated, determine the number of missing modal attributes of the event to be updated (that is, how many of image, spatial coordinates, and text are absent, singly or in combination), and set search conditions according to the missing modal attributes. For example, when spatial coordinates are missing, the search conditions are:
- Entity type constraint: only entities with type="QualityIssue" and subtype="cable problem";
- Relationship constraint: the relationship "belongs to component type" is "server rack";
- Attribute constraint: CLIP similarity of the text description > 0.75.
When images and text are missing, the search conditions are:
- Entity type constraint: type="QualityIssue" (locating quality problem entities);
- Relationship constraint: "belongs to component type" is the target type (e.g., "server rack"), and "occurred in process" is the current process (e.g., "cable laying");
- Spatial constraint: BIM coordinate distance < 2 meters (using the known spatial coordinates).
Based on the search conditions, the updated events within the previous time period T are filtered, and those that meet the search conditions are marked as comparison events.
[0023] S34: Determine the weight of each fused feature according to which modal attributes of the event to be updated are missing, for example:
- image missing: semantic similarity coefficient 0.6, spatial distance coefficient 0.3, time decay coefficient 0.1;
- spatial coordinates missing: semantic similarity coefficient 0.7, time decay coefficient 0.3;
- text missing: spatial distance coefficient 0.75, time decay coefficient 0.25;
- image and spatial coordinates missing: semantic similarity coefficient ..., time decay coefficient 0.8;
- image and text missing: spatial distance coefficient 0.6, time decay coefficient 0.4;
- spatial coordinates and text missing: time decay coefficient 1 (only the time dimension is considered for this bimodal loss).
The event weight of each comparison event is then determined, and the virtual features are determined from the event weights. For example, suppose the image of the event to be updated is missing, and comparison event A has a semantic similarity of 0.9, a spatial distance of 1 meter, and a time difference of 15 days; the event weight of comparison event A is ... Here 0.9 is the semantic similarity, calculated as the cosine similarity of CLIP model features. The spatial distance parameter is calculated as: spatial distance coefficient = 1 - (actual distance / maximum effective distance), where the maximum effective distance is typically set to 2 meters (adjustable according to project accuracy requirements). The time parameter is calculated as: time decay weight = 0.95^(time difference / 30), where 0.95 represents a 5% decay per month (adjustable according to project cycle characteristics) and 30 is the number of days per month, with the time difference expressed in days. Comparison event B has a semantic similarity of 0.85, a spatial distance of 0.5 meters, and a time difference of 60 days; the event weight of comparison event B is ... The virtual features are the weighted average $(w_A F_A + w_B F_B) / (w_A + w_B)$, where $w_A$ and $w_B$ denote the event weights of comparison events A and B, and $F_A$ and $F_B$ denote their image features. The determined virtual features are mapped onto the supervision knowledge graph.
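The worked example in S34 can be traced with a short sketch. The source text does not show the formula that combines the three coefficients into an event weight, so the sketch assumes a plain weighted sum of semantic similarity, spatial distance coefficient, and time decay weight; that combination rule, the clamping of the spatial coefficient at zero, and the two-element feature vectors are assumptions, not the patent's stated method. The spatial and time formulas and the 0.6 / 0.3 / 0.1 coefficients are taken from the text.

```python
def spatial_coefficient(distance_m, max_distance_m=2.0):
    # 1 - (actual distance / maximum effective distance); clamped at 0 (assumption).
    return max(0.0, 1.0 - distance_m / max_distance_m)


def time_decay(days, monthly_factor=0.95, days_per_month=30):
    # 0.95 ** (time difference / 30): a 5% decay per month.
    return monthly_factor ** (days / days_per_month)


def event_weight(semantic, distance_m, days, coeffs=(0.6, 0.3, 0.1)):
    # ASSUMPTION: the event weight is a weighted sum of the three scores.
    c_sem, c_space, c_time = coeffs
    return (c_sem * semantic
            + c_space * spatial_coefficient(distance_m)
            + c_time * time_decay(days))


def virtual_feature(weighted_events):
    """Weighted average of feature vectors; weighted_events is [(weight, vector), ...]."""
    total = sum(w for w, _ in weighted_events)
    dims = len(weighted_events[0][1])
    return [sum(w * vec[d] for w, vec in weighted_events) / total for d in range(dims)]


# Image-missing case from the text: comparison events A and B.
w_a = event_weight(semantic=0.9, distance_m=1.0, days=15)
w_b = event_weight(semantic=0.85, distance_m=0.5, days=60)

# Hypothetical image feature vectors for A and B (real CLIP features are much longer).
feat = virtual_feature([(w_a, [1.0, 0.0]), (w_b, [0.0, 1.0])])
```

Under these assumptions event B ends up with the larger weight: it is older, but it is closer in space, and the spatial coefficient carries three times the weight of the time decay.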
[0024] Furthermore, the calculation process for the compliance entropy value of the supervision knowledge graph is as follows: S41: Define rule coverage, data consistency, and traceability integrity;
[0025] S42: Normalize rule coverage, data consistency, and traceability integrity, denote the normalized values as x1, x2, and x3 respectively, and calculate the compliance entropy value from them.
[0026] Furthermore, rule coverage is the ratio of the number of matched rules to the total number of rules: rule coverage = (number of matched rules) / (total number of rules). The number of matched rules is the number of supervision rules that were successfully matched and executed within the current data processing cycle; the total number of rules is the total number of predefined supervision rules in the system.
[0027] Furthermore, data consistency is calculated from the proportion of conflicting data, using the number of data records that violate the entity relationship constraints of the knowledge graph ("conflicting data") and the total number of data records in the current period.
[0028] Furthermore, traceability integrity is calculated from the length of the data chains with a complete data lineage record (i.e., the number of records traceable through the entire process from data collection, processing, and application to archiving) and the total number of data chains generated in the current period.
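The three indicators defined above can be computed as simple ratios. Only the rule coverage ratio is stated explicitly in the text; the forms used here for data consistency (one minus the conflict proportion) and traceability integrity (share of chains with complete lineage) are assumptions, and the counts are made up. The compliance entropy formula itself is not legible in this text, so the sketch stops at the indicators.

```python
def rule_coverage(matched_rules, total_rules):
    # Stated in the text: matched rules divided by total rules.
    return matched_rules / total_rules


def data_consistency(conflicting_records, total_records):
    # ASSUMPTION: consistency is one minus the proportion of conflicting records.
    return 1.0 - conflicting_records / total_records


def traceability_integrity(complete_chains, total_chains):
    # ASSUMPTION: share of data chains that carry a complete lineage record.
    return complete_chains / total_chains


indicators = {
    "rule_coverage": rule_coverage(85, 100),
    "data_consistency": data_consistency(30, 1000),
    "traceability_integrity": traceability_integrity(60, 100),
}
```

All three values fall in [0, 1], so they can be passed to a normalisation step without further rescaling.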
[0029] Compared with the prior art, the present invention has the following beneficial effects:
[0030] This invention constructs a multi-source heterogeneous data acquisition system, integrating spatial, temporal, business, textual, and visual data to achieve comprehensive coverage and accurate analysis of information-based engineering supervision data. Based on this, a four-dimensional knowledge graph modeling framework is designed, combining entity alignment from graph neural networks, relation extraction from remote supervision, and weighted fusion attribute processing to construct a dynamically evolving, cross-source consistent knowledge network. For scenarios with missing multimodal data, a virtual feature compensation mechanism (historical event retrieval, multi-dimensional feature weighted fusion) generates virtual features and maps them to the graph, effectively filling data gaps and solving the problem of incomplete knowledge coverage in complex engineering scenarios. A compliance entropy evaluation model is introduced to quantify the compliance status of the supervision knowledge graph from three dimensions: rule coverage, data consistency, and traceability integrity. This overcomes the limitations of lag and subjectivity in traditional manual compliance checks. When the entropy value falls below a threshold, a reinforcement learning closed-loop process is automatically triggered, achieving dynamic adaptation of supervision rules, autonomous improvement of data quality, and full-chain coverage of knowledge traceability.
Attached Figure Description
[0031] Figure 1 is a flowchart of the dynamic modeling method for supervision knowledge graphs based on multi-source heterogeneous data fusion;
[0032] Figure 2 is a flowchart for constructing the four-dimensional knowledge graph modeling framework;
[0033] Figure 3 is a flowchart for mapping virtual features.
Detailed Implementation
[0034] Referring to Figures 1 to 3, a dynamic modeling method for supervision knowledge graphs based on multi-source heterogeneous data fusion includes the following steps:
[0035] S1: Real-time collection of multi-source heterogeneous data on information engineering; multi-source heterogeneous data includes spatial dimension data, time dimension data, business dimension data, text dimension data, and visual dimension data.
[0036] Spatial Dimension Data: A three-dimensional spatial coordinate system is constructed using BIM, LiDAR, and SLAM algorithms to achieve millimeter-level accuracy in equipment positioning (e.g., server rack coordinate error < ±2mm).
[0037] Temporal dimension data: IoT sensor data (such as network traffic and CPU load) is sampled at the millisecond level, and noise is filtered out with a Kalman filter so that only valid data (confidence level > 95%) is retained.
[0038] Business dimension data: Information engineering management systems (such as Jira and TAPD) are integrated through API gateways, structured data such as contract terms and schedules is extracted automatically, and unstructured requirements in change logs are parsed using natural language processing (NLP).
[0039] Text dimension data: A dynamic knowledge graph generator is built on LangChain + Neo4j. GPT-4 automatically parses supervision logs and expert experience documents to extract risk descriptions and disposal plans, and generates timestamped knowledge nodes.
[0040] Visual dimension data: On-site cameras equipped with YOLOv8 monitor construction quality in real time (such as missing cable markings and equipment installation deviations). Combined with the CLIP model, image features are aligned with the spatial coordinates of the BIM model to generate a chain of evidence with geographical location.
[0041] S2: Based on multi-source heterogeneous data from information engineering projects, construct a four-dimensional knowledge graph modeling framework consisting of entities, relationships, attributes, and timestamps. Each knowledge unit (entity, relationship, or attribute) carries a precise timestamp for its existence and its changes, so that knowledge evolution can be traced along a timeline.
S21: Use the DGL library to construct a graph neural network, and use node embeddings to align the same entity across sources (e.g., identifying "Server A" in BIM and "Device ID-001" in sensor data as the same entity);
S22: Use distant supervision (remote supervised learning) to extract relationships from supervision documents (e.g., from "According to Article 5.2 of the Design Code, server deployment must meet heat dissipation requirements," the relationship (server deployment, must meet, heat dissipation requirements) is generated), and assign relationship weights according to document authority;
S23: For multi-source attributes of the same entity (e.g., BIM coordinates and sensor location data), apply a weighted fusion algorithm (weights are determined by data credibility), and add a collection timestamp to the fusion result.
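The entity alignment in S21 ultimately compares node embeddings from different sources. The sketch below covers only that comparison step and assumes the embeddings have already been produced by the graph neural network; the vectors, the greedy best-match strategy, and the 0.9 threshold are hypothetical, and no DGL call is shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def align(source_a, source_b, threshold=0.9):
    """For each entity in source_a, pick the most similar entity in source_b.

    A pair is kept only if its similarity reaches the threshold (assumed value).
    """
    pairs = {}
    for name_a, emb_a in source_a.items():
        best_name, best_sim = None, threshold
        for name_b, emb_b in source_b.items():
            sim = cosine(emb_a, emb_b)
            if sim >= best_sim:
                best_name, best_sim = name_b, sim
        if best_name is not None:
            pairs[name_a] = best_name
    return pairs


# Hypothetical 3-dimensional embeddings; real node embeddings are much longer.
bim_entities = {"Server A": [0.9, 0.1, 0.3], "Switch B": [0.1, 0.8, 0.2]}
sensor_entities = {"Device ID-001": [0.88, 0.12, 0.31], "Device ID-002": [0.0, 0.2, 0.95]}
matches = align(bim_entities, sensor_entities)
```

In this toy data "Server A" is matched to "Device ID-001", while "Switch B" stays unmatched because no sensor-side embedding is close enough.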
[0042] S3: When the modal attributes of events to be updated are missing during the updating of the four-dimensional knowledge graph modeling framework, virtual features are mapped in the four-dimensional knowledge graph modeling framework.
[0043] S31: When the knowledge graph receives an event to be updated, it parses the modal attributes of the event to be updated.
[0044] S32: When the image or spatial-coordinate modal attribute of the event to be updated is missing, the compensation mechanism is activated.
[0045] S33: After the compensation mechanism is activated, determine the number of missing modal attributes of the event to be updated (that is, how many of image, spatial coordinates, and text are absent, singly or in combination), and set search conditions according to the missing modal attributes. For example, when spatial coordinates are missing, the search conditions are:
- Entity type constraint: only entities with type="QualityIssue" and subtype="cable problem";
- Relationship constraint: the relationship "belongs to component type" is "server rack";
- Attribute constraint: CLIP similarity of the text description > 0.75.
When images and text are missing, the search conditions are:
- Entity type constraint: type="QualityIssue" (locating quality problem entities);
- Relationship constraint: "belongs to component type" is the target type (e.g., "server rack"), and "occurred in process" is the current process (e.g., "cable laying");
- Spatial constraint: BIM coordinate distance < 2 meters (using the known spatial coordinates).
Based on the search conditions, the updated events within the previous time period T are filtered, and those that meet the search conditions are marked as comparison events.
[0046] S34: Determine the weight of each fused feature according to which modal attributes of the event to be updated are missing, for example:
- image missing: semantic similarity coefficient 0.6, spatial distance coefficient 0.3, time decay coefficient 0.1;
- spatial coordinates missing: semantic similarity coefficient 0.7, time decay coefficient 0.3;
- text missing: spatial distance coefficient 0.75, time decay coefficient 0.25;
- image and spatial coordinates missing: semantic similarity coefficient ..., time decay coefficient 0.8;
- image and text missing: spatial distance coefficient 0.6, time decay coefficient 0.4;
- spatial coordinates and text missing: time decay coefficient 1 (only the time dimension is considered for this bimodal loss).
The event weight of each comparison event is then determined, and the virtual features are determined from the event weights. For example, suppose the image of the event to be updated is missing, and comparison event A has a semantic similarity of 0.9, a spatial distance of 1 meter, and a time difference of 15 days; the event weight of comparison event A is ... Here 0.9 is the semantic similarity, calculated as the cosine similarity of CLIP model features. The spatial distance parameter is calculated as: spatial distance coefficient = 1 - (actual distance / maximum effective distance), where the maximum effective distance is typically set to 2 meters (adjustable according to project accuracy requirements). The time parameter is calculated as: time decay weight = 0.95^(time difference / 30), where 0.95 represents a 5% decay per month (adjustable according to project cycle characteristics) and 30 is the number of days per month, with the time difference expressed in days. Comparison event B has a semantic similarity of 0.85, a spatial distance of 0.5 meters, and a time difference of 60 days; the event weight of comparison event B is ... The virtual features are the weighted average $(w_A F_A + w_B F_B) / (w_A + w_B)$, where $w_A$ and $w_B$ denote the event weights of comparison events A and B, and $F_A$ and $F_B$ denote their image features. The determined virtual features are mapped onto the supervision knowledge graph.
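The retrieval in S33 for the case where images and text are missing (spatial coordinates known) can be sketched as a filter over historical events. The `Event` structure, its field names, the 90-day window standing in for T, and the sample events are all hypothetical; a real system would express the same conditions as a graph query.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    event_id: str
    entity_type: str
    component_type: str
    process: str
    coords: tuple          # (x, y, z) BIM coordinates in meters
    updated_at: datetime


def find_comparison_events(history, pending, now, window, max_distance_m=2.0):
    """Return historical events that satisfy all search conditions."""
    selected = []
    for ev in history:
        if ev.entity_type != "QualityIssue":            # entity type constraint
            continue
        if ev.component_type != pending.component_type:  # "belongs to component type"
            continue
        if ev.process != pending.process:                # "occurred in process"
            continue
        if not (now - window <= ev.updated_at <= now):   # previous time period T
            continue
        if math.dist(ev.coords, pending.coords) >= max_distance_m:  # spatial constraint
            continue
        selected.append(ev)
    return selected


now = datetime(2025, 7, 1)
pending = Event("new", "QualityIssue", "server rack", "cable laying", (10.0, 5.0, 0.0), now)
history = [
    Event("e1", "QualityIssue", "server rack", "cable laying", (10.5, 5.5, 0.0), now - timedelta(days=20)),
    Event("e2", "QualityIssue", "server rack", "cable laying", (13.0, 5.0, 0.0), now - timedelta(days=10)),
    Event("e3", "QualityIssue", "server rack", "cable laying", (10.2, 5.0, 0.0), now - timedelta(days=200)),
    Event("e4", "QualityIssue", "server rack", "rack installation", (10.1, 5.0, 0.0), now - timedelta(days=5)),
]
comparison = find_comparison_events(history, pending, now, window=timedelta(days=90))
comparison_ids = [ev.event_id for ev in comparison]
```

Only `e1` passes every condition here: `e2` is too far away, `e3` is outside the time window, and `e4` belongs to a different process.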
[0047] S4: Construct a compliance entropy assessment model, periodically calculate the compliance entropy value of the supervision knowledge graph, and automatically trigger the reinforcement learning process when the compliance entropy value falls below a threshold (rule optimization: generate candidate rule sets for uncovered rules; data cleaning: initiate an automatic repair process for conflicting data; traceability completion: trigger supplementary blockchain evidence storage for missing lineage data).
[0048] S41: Define rule coverage as the ratio of the number of matched rules to the total number of rules: rule coverage = (number of matched rules) / (total number of rules). The number of matched rules is the number of supervision rules that were successfully matched and executed within the current data processing cycle; the total number of rules is the total number of predefined supervision rules in the system, including basic standards (national / industry standards), project-customized rules, expert experience rules, etc. For example, if the system has 100 supervision rules and 85 are currently matched, the rule coverage is 85 / 100 = 0.85.
Define data consistency: calculated from the proportion of conflicting data, using the number of data records that violate the entity relationship constraints of the knowledge graph ("conflicting data") and the total number of data records in the current period (covering structured, semi-structured, and unstructured data).
Define traceability integrity: calculated from the length of the data chains with a complete data lineage record (i.e., the number of records traceable through the entire process from data collection, processing, and application to archiving) and the total number of data chains generated within the current period, regardless of whether they have complete traceability information. For example, 100 device status data entries form 100 data chains.
[0049] S42: Normalize rule coverage, data consistency, and traceability integrity, denote the normalized values as x1, x2, and x3 respectively, and calculate the compliance entropy value from them.
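The trigger condition in S4 can be sketched as a threshold check followed by a dispatch of remediation actions. This sketch does not implement reinforcement learning and does not compute the entropy, which it takes as an input. Selecting actions by comparing each indicator with a per-indicator `target` is an assumption made for illustration, as are all numeric values.

```python
ACTIONS = {
    "rule_coverage": "rule optimization: generate candidate rule sets for uncovered rules",
    "data_consistency": "data cleaning: start automatic repair of conflicting data",
    "traceability_integrity": "traceability completion: supplement evidence for missing lineage data",
}


def plan_remediation(entropy, threshold, indicators, target=0.9):
    """Return the remediation actions to queue for the current assessment cycle.

    Nothing is triggered unless the compliance entropy is below the threshold.
    ASSUMPTION: an action is queued only for indicators below `target`.
    """
    if entropy >= threshold:
        return []
    return [ACTIONS[name] for name in ACTIONS if indicators[name] < target]


indicators = {
    "rule_coverage": 0.85,
    "data_consistency": 0.97,
    "traceability_integrity": 0.60,
}
triggered = plan_remediation(entropy=0.4, threshold=0.5, indicators=indicators)
not_triggered = plan_remediation(entropy=0.6, threshold=0.5, indicators=indicators)
```

Separating the trigger (entropy against threshold) from the choice of actions (which indicator is weak) keeps the periodic check cheap and makes each queued action traceable to a specific indicator.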
[0050] The aforementioned method constructs a multi-source heterogeneous data acquisition system, integrating spatial, temporal, business, textual, and visual data to achieve comprehensive coverage and accurate analysis of information-based engineering supervision data. Based on this, a four-dimensional knowledge graph modeling framework is designed, combining entity alignment from graph neural networks, relation extraction from remote supervision, and weighted fusion attribute processing to construct a dynamically evolving, cross-source consistent knowledge network. For scenarios with missing multimodal data, a virtual feature compensation mechanism (historical event retrieval, multi-dimensional feature weighted fusion) generates virtual features and maps them to the graph, effectively filling data gaps and solving the problem of incomplete knowledge coverage in complex engineering scenarios. A compliance entropy evaluation model is introduced to quantify the compliance status of the supervision knowledge graph from three dimensions: rule coverage, data consistency, and traceability integrity. This overcomes the limitations of traditional manual compliance checks due to their lag and subjectivity. When the entropy value falls below a threshold, a reinforcement learning closed-loop process is automatically triggered, achieving dynamic adaptation of supervision rules, autonomous improvement of data quality, and full-chain coverage of knowledge traceability.
[0051] The above formulas are all dimensionless calculations, and the preset parameters in the formulas should be set by those skilled in the art according to the actual situation.
[0052] The above embodiments can be implemented, in whole or in part, by software, hardware, firmware, or any other combination thereof. When implemented using software, the above embodiments can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that includes one or more sets of available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. A semiconductor medium can be a solid-state drive.
[0053] It should be understood that in the various embodiments of this application, the order of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
[0054] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0055] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
[0056] In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0057] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0058] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A dynamic modeling method for a supervision knowledge graph based on multi-source heterogeneous data fusion, characterized in that it comprises the following steps:
S1: collect multi-source heterogeneous data on information engineering in real time;
S2: based on the multi-source heterogeneous data on information engineering, construct a four-dimensional knowledge graph modeling framework comprising entities, relationships, attributes, and timestamps;
S3: when modal attributes of an event to be updated are missing during updating of the four-dimensional knowledge graph modeling framework, map virtual features into the four-dimensional knowledge graph modeling framework, where a modal attribute is considered missing when any one or a combination of image, spatial coordinates, and text is absent, and the process of mapping virtual features is as follows:
S31: when the knowledge graph receives an event to be updated, parse the modal attributes of the event to be updated;
S32: when the image or spatial-coordinate modal attribute of the event to be updated is missing, activate the compensation mechanism;
S33: after the compensation mechanism is activated, determine the number of missing modal attributes of the event to be updated, set search conditions, filter the updated events within the previous time period T according to the search conditions, and mark the updated events that meet the search conditions as comparison events;
S34: determine the weight of each fusion feature, determine the event weight of each comparison event, then determine the virtual features, and map the determined virtual features into the supervision knowledge graph;
S4: construct a compliance entropy value assessment model, periodically calculate the compliance entropy value of the supervision knowledge graph, and automatically trigger the reinforcement learning process when the compliance entropy value is lower than the threshold.
The compliance entropy value of the supervision knowledge graph is calculated as follows:
S41: define rule coverage, data consistency, and traceability integrity;
S42: normalize rule coverage, data consistency, and traceability integrity, denote them x1, x2, and x3 respectively, and calculate the compliance entropy value from them, where:
rule coverage is the ratio of the number of matched rules to the total number of rules, the number of matched rules being the number of supervision rules successfully matched and executed within the current data processing cycle, and the total number of rules being the number of supervision rules predefined in the system;
data consistency is calculated from the percentage of conflicting data, the conflicting data being the data records that violate the entity relationship constraints of the knowledge graph, and the percentage being taken over the total number of data records within the current period;
traceability integrity is determined from the length of the data chains with a complete data lineage record, i.e., the number of records traceable through the entire process from data collection, processing, and application to archiving, and the total number of data chains generated in the current period.
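As an illustration of steps S41, S42, and S4, the following Python sketch computes the three normalized indicators from raw counts and checks them against a threshold. It is a minimal sketch under stated assumptions, not the claimed model: the class and function names are hypothetical, data consistency is assumed to be one minus the conflicting-data ratio, traceability integrity is assumed to be the share of fully traceable chains, and the formula that combines x1, x2, and x3 into the compliance entropy value is not reproduced in this text, so an arithmetic mean stands in for it as a replaceable placeholder.

```python
from dataclasses import dataclass


@dataclass
class CycleCounts:
    """Raw counts for one data processing cycle (field names are hypothetical)."""
    matched_rules: int        # supervision rules matched and executed in the cycle
    total_rules: int          # supervision rules predefined in the system
    conflicting_records: int  # records violating entity relationship constraints
    total_records: int        # all data records in the cycle
    traceable_chains: int     # data chains with a complete lineage record
    total_chains: int         # all data chains generated in the cycle


def _ratio(numerator, denominator):
    """Return numerator / denominator clipped to [0, 1]; 0.0 if the denominator is not positive."""
    if denominator <= 0:
        return 0.0
    return min(1.0, max(0.0, numerator / denominator))


def compliance_indicators(counts):
    """Return (x1, x2, x3): rule coverage, data consistency, traceability integrity."""
    x1 = _ratio(counts.matched_rules, counts.total_rules)
    # Assumption: consistency is the complement of the conflicting-data ratio.
    x2 = 1.0 - _ratio(counts.conflicting_records, counts.total_records)
    # Assumption: integrity is the share of chains that are fully traceable.
    x3 = _ratio(counts.traceable_chains, counts.total_chains)
    return x1, x2, x3


def placeholder_score(indicators):
    """Stand-in for the compliance entropy formula, which this text does not give."""
    return sum(indicators) / len(indicators)


def needs_reinforcement(indicators, threshold, combine=placeholder_score):
    """Return True when the combined value falls below the threshold (step S4)."""
    return combine(indicators) < threshold


if __name__ == "__main__":
    counts = CycleCounts(8, 10, 5, 100, 45, 50)
    indicators = compliance_indicators(counts)
    print(indicators, needs_reinforcement(indicators, threshold=0.9))
```

Passing the combining function as a parameter keeps the indicator calculation separate from the entropy formula, so the placeholder can be swapped for the actual formula without touching the rest of the sketch.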
2. The method for dynamic modeling of a supervision knowledge graph based on multi-source heterogeneous data fusion according to claim 1, characterized in that the multi-source heterogeneous data includes spatial dimension data, temporal dimension data, business dimension data, text dimension data, and visual dimension data.
3. The method for dynamic modeling of a supervision knowledge graph based on multi-source heterogeneous data fusion according to claim 1, characterized in that the four-dimensional knowledge graph modeling framework is constructed as follows:
S21: use the DGL library to build a graph neural network and align the same entity from different sources through node embedding technology;
S22: use remote supervised learning (distant supervision) to extract relations from the supervision documents and assign relation weights according to the authority of the documents;
S23: for the multi-source attributes of the same entity, use a weighted fusion algorithm and add a collection timestamp to the fusion result.
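Step S23 can be illustrated with a short Python sketch that fuses several observations of one numeric attribute by weighted average and stamps the result with a collection time. This is a minimal sketch under stated assumptions rather than the claimed algorithm: the `Observation` type, the per-source weights, and the choice of the most recent collection time as the timestamp are all hypothetical, since the claim does not specify how weights are assigned or which timestamp is attached.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One source's reading of an entity attribute (hypothetical structure)."""
    source: str
    value: float
    weight: float          # assumed per-source reliability weight
    collected_at: datetime


def fuse_attribute(observations):
    """Fuse observations by weighted average and attach a collection timestamp."""
    usable = [o for o in observations if o.weight > 0]
    if not usable:
        raise ValueError("no observation with a positive weight")
    total_weight = sum(o.weight for o in usable)
    fused_value = sum(o.weight * o.value for o in usable) / total_weight
    return {
        "value": fused_value,
        # Assumption: the fused record carries the latest collection time.
        "collected_at": max(o.collected_at for o in usable),
        "sources": [o.source for o in usable],
    }


if __name__ == "__main__":
    readings = [
        Observation("sensor", 10.0, 3.0, datetime(2025, 7, 1, 9, 0)),
        Observation("report", 20.0, 1.0, datetime(2025, 7, 1, 10, 30)),
    ]
    print(fuse_attribute(readings))
```

Keeping the contributing source names in the fused record preserves lineage for the attribute, which fits the traceability integrity indicator in claim 1.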