Knowledge graph-based method for automatically checking logical errors in hydrological compilation results

Constructing a hydrological knowledge graph model and performing entity alignment addresses the accuracy and efficiency problems of logical error verification in existing hydrological compilation results. The approach enables global, relational modeling of hydrological data and improves the accuracy and efficiency of verification.

CN122240691A · Pending · Publication Date: 2026-06-19 · YUCI HYDROLOGY & WATER RESOURCES SURVEY BUREAU OF YELLOW RIVER WATER RESOURCES COMMISSION

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
YUCI HYDROLOGY & WATER RESOURCES SURVEY BUREAU OF YELLOW RIVER WATER RESOURCES COMMISSION
Filing Date
2026-03-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing automated methods for verifying logical errors in hydrological data compilation are ill-suited to the complex correlation features and diverse entity descriptions of hydrological data. They fail to capture the multidimensional and complex relationships between hydrological entities globally and do not effectively handle entity ambiguities caused by differences in data descriptions, resulting in inaccurate verification results.

Method used

A hydrological knowledge graph pattern is constructed, semantically consistent knowledge graph instances are generated through entity alignment, logical verification items are transformed into graph pattern matching conditions, the rule set is optimized, and graph traversal verification is performed to identify logical error candidates.

Benefits of technology

It enables global and correlated modeling of hydrological data, accurately captures multidimensional and complex correlations, improves the accuracy and efficiency of verification results, and ensures data quality.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses an automatic logical error verification method for hydrological compilation results based on knowledge graphs, belonging to the field of hydrological data processing technology. The method first acquires the hydrological compilation data to be verified, analyzes hydrological elements and their relationships to construct a hydrological knowledge graph pattern, and generates semantically consistent hydrological knowledge graph instances through entity alignment processing. Then, it determines logical verification items based on hydrological physical laws and transforms them into graph pattern matching conditions, generating a rule set, and optimizing the rules through conflict detection and priority sorting. Finally, based on the optimized rule set, it performs graph traversal on the graph instances, searches for graph paths that violate the matching conditions to obtain logical error candidates, and outputs the error candidates and their corresponding graph context information. This invention achieves global relational modeling of hydrological data, effectively eliminates semantic ambiguity of entities, accurately captures complex relationships between hydrological elements, and improves the accuracy and comprehensiveness of logical error identification in hydrological compilation results.

Description

Technical Field

[0001] This invention relates to the field of hydrological data processing technology, and in particular to an automatic method for verifying logical errors in hydrological compilation results based on knowledge graphs.

Background Technology

[0002] Hydrological data compilation is the core data support for hydrological operations such as hydrological monitoring, water resources management, and water conservancy project planning. Its logical accuracy directly affects the scientific nature of hydrological analysis and decision-making. Therefore, automated verification of logical errors in hydrological data compilation has become a key technical aspect in the field of hydrological data processing. With the improvement of hydrological monitoring networks, the types of hydrological stations and elements covered by hydrological data compilation have increased significantly. The relationships between upstream and downstream hydrological stations and between functional dependencies of hydrological elements have become more multidimensional and complex. Furthermore, hydrological entities from different data sources often have inconsistent naming and description formats, which places higher demands on the association modeling and unified semantic processing capabilities of automated verification technology.

[0003] Existing automated methods for verifying logical errors in hydrological compilation results are ill-suited to the complex correlation characteristics and diverse entity descriptions of hydrological data. The core problem lies in their lack of a systematic approach to constructing comprehensive relationships between hydrological elements and stations. They only set verification logic for isolated hydrological indicators or single correlation types, failing to capture multi-type relationships such as upstream-downstream, observation, and functional dependencies between hydrological entities at a global level. Furthermore, existing methods do not specifically address ambiguities in hydrological entities caused by differences in data descriptions, making it difficult to effectively distinguish whether hydrological entities under different descriptions belong to the same object. This leads to correlation biases in the verification logic when matching hydrological data, ultimately making it difficult for existing automated verification methods to comprehensively and accurately identify hidden logical errors in the compilation results, thus failing to effectively guarantee the data quality of hydrological compilation results.

Summary of the Invention

[0004] This invention provides an automatic logical error verification method for hydrological compilation results based on knowledge graphs to solve the problems mentioned in the background art.

[0005] To achieve the above objectives, the present invention provides an automatic logical error detection method for hydrological compilation results based on knowledge graphs, comprising:

[0006] Obtain the hydrological compilation data to be verified;

[0007] Analyze the hydrological elements contained in the hydrological compilation data and the relationships between them, and construct a hydrological knowledge graph model;

[0008] Based on the hydrological knowledge graph model, entity instances and relation instances are extracted from the hydrological compilation data, and entity alignment is performed to handle entity ambiguities caused by differences in data description during the extraction process, generating semantically consistent hydrological knowledge graph instances.

[0009] Based on hydrophysical laws and statistical consistency requirements, logical verification items are determined and transformed into graph pattern matching conditions to generate a set of logical verification rules.

[0010] The set of logical verification rules is subjected to rule conflict detection, and the rules are prioritized based on the detection results to obtain an optimized set of logical verification rules.

[0011] Based on the optimized set of logical verification rules, the hydrological knowledge graph instances are traversed to find graph paths that violate the graph pattern matching conditions, and logical error candidates are obtained.

[0012] Output logical error candidates and their corresponding graph context information.

[0013] Preferably, the step of performing entity alignment processing on entity ambiguities caused by differences in data description during the extraction process to generate semantically consistent hydrological knowledge graph instances includes:

[0014] Identify hydrological entity instances in the hydrological compilation data that have the same or similar names but may refer to different hydrological entities, and pair them as candidate entity pairs.

[0015] For each candidate entity pair, determine the similarity of the associated time-series data in the time dimension and the proximity of the associated spatial location data in the spatial dimension;

[0016] Based on temporal similarity and spatial proximity, it is determined whether candidate entity pairs point to the same hydrological entity. Entity instances that are determined to point to the same hydrological entity are merged to obtain semantically consistent hydrological knowledge graph instances.

[0017] Preferably, determining, for each candidate entity pair, the similarity of the associated time-series data in the time dimension and the proximity of the associated spatial location data in the spatial dimension includes:

[0018] Extract the water level or flow time series corresponding to each entity instance in the candidate entity pair, and generate a time similarity metric by analyzing the changing trend and fluctuation characteristics of the series.

[0019] Extract the geographic coordinates or relative location description of each entity instance, and generate a spatial proximity metric by measuring the Euclidean distance or topological relationship between the coordinates;

[0020] The temporal similarity metric and spatial proximity metric are weighted and combined. When the combined metric meets the preset identity threshold condition, the candidate entity pair is determined to point to the same hydrological entity.

[0021] Preferably, the step of performing rule conflict detection on the logical verification rule set includes:

[0022] For each logical verification rule, perform graph pattern matching condition parsing to extract the trigger graph pattern and consequent constraint expression of the rule;

[0023] Compare the trigger graph patterns and consequent constraint expressions of different rules to identify rule pairs that have logical inclusion, logical mutual exclusion, or logical redundancy relationships.
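The comparison of trigger graph patterns and consequent constraint expressions can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that a trigger graph pattern is reduced to a tuple of relation types and that a consequent constraint is a closed numeric interval on one attribute; the `Rule` class, the attribute name `flow_ratio`, and the rule names are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    trigger: Tuple[str, ...]  # simplified trigger pattern: a chain of relation types
    attribute: str            # attribute the consequent constrains
    low: float                # lower bound of the allowed interval
    high: float               # upper bound of the allowed interval


def classify(a: Rule, b: Rule) -> Optional[str]:
    """Classify the relationship between two rules, or return None."""
    # Rules with different triggers or attributes are treated as unrelated.
    if a.trigger != b.trigger or a.attribute != b.attribute:
        return None
    if (a.low, a.high) == (b.low, b.high):
        return "redundant"   # identical constraint on the same pattern
    if a.high < b.low or b.high < a.low:
        return "exclusive"   # disjoint intervals cannot both hold
    a_contains_b = a.low <= b.low and b.high <= a.high
    b_contains_a = b.low <= a.low and a.high <= b.high
    if a_contains_b or b_contains_a:
        return "inclusion"   # one interval lies inside the other
    return None              # partial overlap: not flagged in this sketch


def detect_conflicts(rules: List[Rule]) -> List[Tuple[str, str, str]]:
    """Return (rule_a, rule_b, kind) for every flagged rule pair."""
    found = []
    for a, b in combinations(rules, 2):
        kind = classify(a, b)
        if kind is not None:
            found.append((a.name, b.name, kind))
    return found


rules = [
    Rule("wide_balance", ("UPSTREAM_OF",), "flow_ratio", 0.8, 1.5),
    Rule("tight_balance", ("UPSTREAM_OF",), "flow_ratio", 0.9, 1.2),
    Rule("odd_balance", ("UPSTREAM_OF",), "flow_ratio", 2.0, 3.0),
]
conflicts = detect_conflicts(rules)
```

A real rule base would need structural comparison of graph patterns and richer constraint expressions; the interval model is only meant to make the three relationship types concrete.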

[0024] Preferably, the priority ranking based on the detection results includes:

[0025] For rule pairs that have logical inclusion, logical mutual exclusion, or logical redundancy, the rules to be retained are determined based on the universality of the hydrophysical laws upon which the rules are based or the accuracy of historical verification.

[0026] Each retained logical verification rule is assigned a priority weight, which is determined based on the strength of the hydrophysical laws involved in the rule or the confidence level of the statistical consistency constraints.

[0027] The retained logic verification rules are sorted in descending order of priority weight to generate an ordered rule list, which is then used as the optimized logic verification rule set.
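The retention and ordering steps above could look roughly like the following sketch. It assumes a single numeric weight per rule that stands in for both the retention criterion and the priority weight; the patent describes these as separate considerations, so this is a simplification. The function name `optimize_rules`, the rule names, and the weights are hypothetical.

```python
from typing import Dict, List, Tuple


def optimize_rules(
    weights: Dict[str, float],
    conflicts: List[Tuple[str, str, str]],
) -> List[str]:
    """Drop the weaker rule of each conflicting pair, then sort by weight.

    `weights` maps rule name to priority weight; `conflicts` holds
    (rule_a, rule_b, kind) triples from a prior conflict detection pass.
    """
    dropped = set()
    for rule_a, rule_b, _kind in conflicts:
        # Skip pairs that were already resolved by an earlier decision.
        if rule_a in dropped or rule_b in dropped:
            continue
        # Keep the rule with the higher weight; on a tie, keep rule_a.
        loser = rule_a if weights[rule_a] < weights[rule_b] else rule_b
        dropped.add(loser)
    kept = [name for name in weights if name not in dropped]
    # Descending priority weight gives the ordered rule list.
    return sorted(kept, key=lambda name: weights[name], reverse=True)


weights = {"stage_flow": 0.80, "loose_balance": 0.60, "mass_balance": 0.95}
conflicts = [("mass_balance", "loose_balance", "inclusion")]
ordered = optimize_rules(weights, conflicts)
```

In this example `loose_balance` is dropped because it conflicts with a higher-weighted rule, and the remaining rules are applied with `mass_balance` first.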

[0028] Preferably, the step of performing graph traversal on hydrological knowledge graph instances based on the optimized set of logical verification rules to find graph paths that violate graph pattern matching conditions and obtain logical error candidates includes:

[0029] Starting with each entity instance, the graph is traversed sequentially by applying each logical validation rule in the order of the optimized logical validation rule set.

[0030] For the logical verification rule currently being applied, perform a breadth-first or depth-first traversal along the graph according to the relationship type specified in the rule to generate a set of candidate paths;

[0031] Each candidate path is matched with the trigger graph pattern of the current logical verification rule. If the match is successful, the attribute values of each entity instance on the candidate path are extracted.

[0032] The extracted attribute values are compared with the consequent constraint expression of the current logical verification rule to determine whether they meet the numerical range or logical relationship in the consequent constraint expression.

[0033] If the attribute value does not satisfy the consequent constraint expression, then the candidate path and the entity instance involved are marked as logical error candidates that violate the current logical verification rules.

[0034] Preferably, the graph traversal further includes:

[0035] Based on the intermediate constraints in the current logical verification rules, prune candidate paths that do not meet the intermediate constraints during the traversal process;

[0036] For candidate paths that meet the triggering conditions of the current logical verification rule, extract the attribute values of each entity instance on the candidate path and compare them with the expected attribute value range or logical relationship in the consequent constraint expression of the current rule.

[0037] Record all candidate paths that do not satisfy the consequent constraint expression, and save the identifier and attribute value of each entity instance on the candidate path as part of the graph context information of the logical error candidate.
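The traversal, pruning, and recording steps can be sketched as a breadth-first search over a small in-memory graph. This is a minimal illustration under stated assumptions, not the patented implementation: the trigger graph pattern is reduced to "a path of `hops` edges of one relation type", the intermediate constraint is a per-node predicate, and the sample rule (downstream flow not lower than upstream flow) and all data values are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple


def find_violations(
    nodes: Dict[str, dict],
    edges: List[Tuple[str, str, str]],
    relation: str,
    hops: int,
    keep_node: Callable[[dict], bool],
    consequent: Callable[[List[dict]], bool],
) -> List[dict]:
    """Breadth-first search for paths that match the trigger but fail the consequent."""
    # Index only the edges of the relation type named by the rule.
    adjacency: Dict[str, List[str]] = {}
    for src, rel, dst in edges:
        if rel == relation:
            adjacency.setdefault(src, []).append(dst)

    violations = []
    for start in nodes:
        if not keep_node(nodes[start]):
            continue  # start node fails the intermediate constraint
        queue = deque([(start,)])
        while queue:
            path = queue.popleft()
            if len(path) == hops + 1:
                # Path matches the trigger; check the consequent constraint.
                if not consequent([nodes[n] for n in path]):
                    violations.append(
                        {"path": path, "context": {n: nodes[n] for n in path}}
                    )
                continue
            for nxt in adjacency.get(path[-1], []):
                if nxt in path:
                    continue  # do not revisit nodes on the same path
                if not keep_node(nodes[nxt]):
                    continue  # prune paths that break the intermediate constraint
                queue.append(path + (nxt,))
    return violations


nodes = {
    "upstream": {"flow": 120.0},
    "midstream": {"flow": 150.0},
    "downstream": {"flow": 140.0},
    "gauge_x": {"flow": None},  # missing value, pruned by keep_node
}
edges = [
    ("upstream", "UPSTREAM_OF", "midstream"),
    ("midstream", "UPSTREAM_OF", "downstream"),
    ("downstream", "UPSTREAM_OF", "gauge_x"),
]
errors = find_violations(
    nodes, edges, "UPSTREAM_OF", hops=1,
    keep_node=lambda attrs: attrs.get("flow") is not None,
    consequent=lambda attrs: attrs[-1]["flow"] >= attrs[0]["flow"],
)
```

Each returned entry keeps the path and the attribute values of the nodes on it, which corresponds to the graph context information saved with a logical error candidate.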

[0038] Preferably, the step of converting the logical check item into a graph pattern matching condition includes:

[0039] Analyze the hydrological entity types and relationship types involved in the logical verification items to determine the node templates and edge templates in the graph pattern;

[0040] Add attribute constraints to the node template. The attribute constraints correspond to the numerical comparison relationships, timing relationships, or logical operation relationships in the logical verification items.

[0041] Combine node templates and edge templates into a graph pattern, and set this graph pattern as the matching condition for logical validation items.
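One way to picture node templates, edge templates, and attribute constraints is the sketch below. The class names, the `evaluate` helper, and the example check item are hypothetical, and for brevity the attribute constraint is attached to the whole pattern rather than to an individual node template.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class NodeTemplate:
    var: str          # variable name used inside the pattern
    entity_type: str  # entity type the bound node must have


@dataclass(frozen=True)
class EdgeTemplate:
    src: str       # variable name of the source node
    relation: str  # required relation type
    dst: str       # variable name of the target node


@dataclass(frozen=True)
class GraphPattern:
    nodes: Tuple[NodeTemplate, ...]
    edges: Tuple[EdgeTemplate, ...]
    constraint: Callable[[Dict[str, dict]], bool]  # attribute constraint


# Hypothetical check item: flow at a downstream station is not lower than
# flow at the station directly upstream of it.
FLOW_BALANCE = GraphPattern(
    nodes=(NodeTemplate("up", "Station"), NodeTemplate("down", "Station")),
    edges=(EdgeTemplate("up", "UPSTREAM_OF", "down"),),
    constraint=lambda attrs: attrs["down"]["flow"] >= attrs["up"]["flow"],
)


def evaluate(
    pattern: GraphPattern,
    binding: Dict[str, str],
    node_types: Dict[str, str],
    node_attrs: Dict[str, dict],
    edge_set: Set[Tuple[str, str, str]],
) -> Optional[bool]:
    """Return None if the pattern does not apply, else whether the constraint holds."""
    for template in pattern.nodes:
        if node_types.get(binding[template.var]) != template.entity_type:
            return None
    for template in pattern.edges:
        edge = (binding[template.src], template.relation, binding[template.dst])
        if edge not in edge_set:
            return None
    return pattern.constraint({var: node_attrs[node] for var, node in binding.items()})


node_types = {"s1": "Station", "s2": "Station"}
node_attrs = {"s1": {"flow": 150.0}, "s2": {"flow": 140.0}}
edge_set = {("s1", "UPSTREAM_OF", "s2")}
result = evaluate(FLOW_BALANCE, {"up": "s1", "down": "s2"}, node_types, node_attrs, edge_set)
```

Here `result` is `False`, meaning the pattern matched and the attribute constraint was violated; a `None` result would mean the binding does not match the node and edge templates at all.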

[0042] Preferably, the step of outputting logical error candidates and their corresponding graph context information includes:

[0043] Extract all entity instances and their attribute values on the graph path involved in each logical error candidate to generate a context dataset of logical error candidates;

[0044] A visual error report is generated based on the graph path. The visual error report graphically displays the relationships between entity instances, the locations of violations of logical verification rules, and related attribute values.

[0045] Establish an association index between the visualized error report and the corresponding data item in the original hydrological compilation data, and store it in the verification result database.

[0046] Preferably, the analysis of hydrological elements contained in the hydrological compilation data and the relationships between these elements, and the construction of a hydrological knowledge graph model, includes:

[0047] Identify hydrological stations, hydrological element types, and time sections from the compiled hydrological data as candidate entity types;

[0048] Based on the metadata information or the correlation between data records in the hydrological compilation results, the upstream and downstream relationships between hydrological stations, the observation relationships between hydrological stations and hydrological elements, and the functional dependencies between hydrological elements are determined as candidate relationship types.

[0049] Based on candidate entity types and candidate relation types, the attribute sets of each candidate entity type and the attribute constraints of each candidate relation type are defined to form a hydrological knowledge graph model.

[0050] Compared with the prior art, the present invention has the following beneficial effects:

[0051] 1. By constructing a hydrological knowledge graph pattern and aligning entity ambiguities, semantically consistent hydrological knowledge graph instances are generated. At the same time, logical verification items are transformed into graph pattern matching conditions, and graph traversal verification is carried out after optimizing the rules. This achieves global and relational modeling of hydrological compilation data, which can accurately capture the multi-dimensional and complex relationships between hydrological stations and hydrological elements, effectively eliminate entity semantic ambiguities caused by differences in data description, and ensure that the verification work can cover all kinds of inherent logical relationships between hydrological data.

[0052] 2. By combining the temporal and spatial dimensions of entity alignment with intermediate constraint pruning and precise attribute value comparison during graph traversal, a synergistic effect is achieved. This comprehensive approach improves the semantic consistency and entity matching accuracy of hydrological knowledge graph instances from the data source, laying a high-quality data foundation for subsequent verification. Meanwhile, the graph traversal verification reduces the generation of invalid candidate paths and achieves refined matching verification of rule constraints. The synergy of these two approaches further improves the efficiency of identifying logical error candidates and ensures the comprehensiveness of identifying various types of illegal graph paths, making the verification results more reliable.

Attached Figure Description

[0053] Figure 1 is a flowchart illustrating an automatic logical error verification method for hydrological data compilation based on knowledge graphs, provided in an embodiment of the present invention.

[0054] The realization of the objective, functional features and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0055] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0056] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.

[0057] Refer to Figure 1, which is a flowchart of an automatic logical error detection method for hydrological compilation results based on knowledge graphs, provided in an embodiment of the present invention. In this embodiment, the method includes:

[0058] S100. Obtain the hydrological compilation data to be verified;

[0059] S200. Analyze the hydrological elements contained in the hydrological compilation data and the relationships between the hydrological elements, and construct a hydrological knowledge graph model;

[0060] S300. Based on the hydrological knowledge graph model, entity instances and relation instances are extracted from the hydrological compilation results data, and entity alignment processing is performed on entity ambiguities caused by differences in data description during the extraction process to generate semantically consistent hydrological knowledge graph instances.

[0061] S400. Based on hydrophysical laws and statistical consistency requirements, determine the logical verification items, convert the logical verification items into graph pattern matching conditions, and generate a set of logical verification rules;

[0062] S500. Perform rule conflict detection on the set of logical verification rules, and sort them by priority based on the detection results to obtain an optimized set of logical verification rules;

[0063] S600. Based on the optimized set of logical verification rules, perform graph traversal on the hydrological knowledge graph instance, search for graph paths that violate graph pattern matching conditions, and obtain logical error candidates;

[0064] S700. Output logical error candidates and their corresponding graph context information.

[0065] In S200 of this invention, the hydrological elements contained in the hydrological compilation data and the relationships between these elements are analyzed to construct a hydrological knowledge graph model, including:

[0066] Identify hydrological stations, hydrological element types, and time sections from the compiled hydrological data as candidate entity types;

[0067] Based on the metadata information or the correlation between data records in the hydrological compilation results, the upstream and downstream relationships between hydrological stations, the observation relationships between hydrological stations and hydrological elements, and the functional dependencies between hydrological elements are determined as candidate relationship types.

[0068] Based on candidate entity types and candidate relation types, the attribute sets of each candidate entity type and the attribute constraints of each candidate relation type are defined to form a hydrological knowledge graph model.

[0069] This embodiment addresses the technical problem in existing technologies where hydrological data exists in scattered tables or documents, lacking semantic relationships, making it difficult to perform implicit logic verification across data items. By constructing a knowledge graph model oriented towards the hydrological domain, discrete data is transformed into a semantically related graph structure, providing a unified and computable semantic framework for subsequent automated verification, thus realizing the transformation from isolated data to structured knowledge.

[0070] Specifically, candidate entity types are identified from the hydrological compilation data of a tributary in the Yangtze River Basin. For example, by reviewing the hydrological compilation records of the tributary line by line, three specific hydrological stations—upstream, midstream, and downstream—are selected. Two hydrological element types—water level and flow rate—are identified, and two time sections—the end of each month and the end of the flood season each year—are extracted. These selected hydrological stations, hydrological element types, and time sections are then directly used as candidate entity types.

[0071] Furthermore, candidate relationship types are determined based on metadata information in the compiled hydrological data of this tributary.

[0072] For example, the metadata clearly indicates that the upstream station is located upstream of the midstream hydrological station and the midstream hydrological station is located upstream of the downstream station, thereby determining the upstream and downstream relationship between hydrological stations; the metadata shows that each hydrological station has corresponding water level and flow observation records, thereby determining the observation relationship between hydrological stations and hydrological elements; by analyzing the data records, it is found that the flow increases when the water level rises, thereby determining the functional dependency relationship between hydrological elements, and these three relationships are used as candidate relationship types.

[0073] Finally, based on the determined candidate entity types and candidate relation types, the attribute set for each candidate entity type and the attribute constraints for each candidate relation type are defined.

[0074] For example, we can define attribute sets such as geographic coordinates and station establishment time for hydrological stations, attribute sets such as observation units and recording accuracy for hydrological element types, attribute constraints for unidirectional associations for upstream and downstream relationships, and attribute constraints for observation relationships such as each station observing at least one hydrological element. By integrating these attribute sets and attribute constraints, we can form a hydrological knowledge graph model.

[0075] This approach first refines the data to accurately capture entity and relationship types unique to the hydrological domain, avoiding overly generalized pattern definitions. This improves the accuracy of subsequent instance extraction and ensures that the knowledge graph truly reflects the topological structure and physical laws of the hydrological system.
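The schema described in this embodiment can be written down as a small data structure. The following is a minimal sketch, assuming plain Python dataclasses rather than any particular graph database; the type names `Station`, `ElementType`, `TimeSection`, the relation names, and the `label` attribute for time sections are illustrative choices, not terms defined in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class EntityType:
    name: str
    attributes: Tuple[str, ...]  # attribute set of the entity type


@dataclass(frozen=True)
class RelationType:
    name: str
    source: str      # entity type at the start of the relation
    target: str      # entity type at the end of the relation
    constraint: str  # attribute constraint, kept as free text in this sketch


@dataclass
class GraphSchema:
    entity_types: Dict[str, EntityType] = field(default_factory=dict)
    relation_types: Dict[str, RelationType] = field(default_factory=dict)

    def add_entity_type(self, entity_type: EntityType) -> None:
        self.entity_types[entity_type.name] = entity_type

    def add_relation_type(self, relation_type: RelationType) -> None:
        # Both endpoints must be declared before a relation can use them.
        for endpoint in (relation_type.source, relation_type.target):
            if endpoint not in self.entity_types:
                raise ValueError(f"unknown entity type: {endpoint}")
        self.relation_types[relation_type.name] = relation_type


def build_example_schema() -> GraphSchema:
    schema = GraphSchema()
    schema.add_entity_type(EntityType("Station", ("coordinates", "established")))
    schema.add_entity_type(EntityType("ElementType", ("unit", "precision")))
    schema.add_entity_type(EntityType("TimeSection", ("label",)))  # assumed attribute
    schema.add_relation_type(
        RelationType("UPSTREAM_OF", "Station", "Station", "unidirectional")
    )
    schema.add_relation_type(
        RelationType("OBSERVES", "Station", "ElementType",
                     "each station observes at least one element")
    )
    schema.add_relation_type(
        RelationType("DEPENDS_ON", "ElementType", "ElementType",
                     "functional dependency, e.g. flow on water level")
    )
    return schema


schema = build_example_schema()
```

Declaring relation endpoints against known entity types is one simple way to keep the pattern definition from becoming overly general.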

[0076] In step S300 of this invention, entity alignment processing is performed on entity ambiguities caused by differences in data description during the extraction process to generate semantically consistent hydrological knowledge graph instances, including:

[0077] Identify hydrological entity instances in the hydrological compilation data that have the same or similar names but may refer to different hydrological entities, and pair them as candidate entity pairs.

[0078] For each candidate entity pair, determine the similarity of the associated time-series data in the time dimension and the proximity of the associated spatial location data in the spatial dimension;

[0079] Based on temporal similarity and spatial proximity, it is determined whether candidate entity pairs point to the same hydrological entity. Entity instances that are determined to point to the same hydrological entity are merged to obtain semantically consistent hydrological knowledge graph instances.

[0080] In S300 of this invention, determining, for each candidate entity pair, the similarity of the associated time-series data in the time dimension and the proximity of the associated spatial location data in the spatial dimension includes:

[0081] Extract the water level or flow time series corresponding to each entity instance in the candidate entity pair, and generate a time similarity metric by analyzing the changing trend and fluctuation characteristics of the series.

[0082] Extract the geographic coordinates or relative location description of each entity instance, and generate a spatial proximity metric by measuring the Euclidean distance or topological relationship between the coordinates;

[0083] The temporal similarity metric and spatial proximity metric are weighted and combined. When the combined metric meets the preset identity threshold condition, the candidate entity pair is determined to point to the same hydrological entity.

[0084] The formula for calculating the overall similarity is as follows:

[0085] $$S(e_i, e_j) = \alpha \cdot S_T(T_i, T_j) + \beta \cdot S_P(P_i, P_j)$$

[0086] In the formula, $e_i$ and $e_j$ represent the two entity instances in a candidate entity pair, and $S(e_i, e_j)$ represents the overall similarity score of entities $e_i$ and $e_j$, with a value range of $[0, 1]$. The larger the value, the higher the probability that the two instances point to the same hydrological entity.

[0087] $S_T(T_i, T_j)$ is the temporal similarity based on time series data, which can be calculated using dynamic time warping or the Pearson correlation coefficient and normalized to $[0, 1]$; $T_i$ and $T_j$ are the water level or flow rate time series associated with $e_i$ and $e_j$. $S_P(P_i, P_j)$ is the spatial proximity based on location, which can be calculated using the inverse function of the Euclidean distance or a Gaussian kernel function and normalized to $[0, 1]$; the smaller the distance, the higher the proximity. $P_i$ and $P_j$ are the coordinates of $e_i$ and $e_j$, taken from their geographic coordinates or transformed from their relative position descriptions. $\alpha$ and $\beta$ are the weighting coefficients, satisfying $\alpha + \beta = 1$, and can be adjusted according to the actual application scenario. For example, when the time series quality is high, increase $\alpha$.

[0088] It should be noted that historical data may contain issues such as non-standard naming and differences in coordinate descriptions, which may result in the same entity being represented in multiple forms, easily leading to redundant entities and erroneous logical judgments in the knowledge graph.

[0089] This solution introduces an entity alignment strategy based on temporal features and spatial proximity to eliminate ambiguity caused by differences in data descriptions, generating semantically consistent and non-redundant hydrological knowledge graph instances. This provides a clean and reliable data foundation for subsequent rule verification, significantly improving graph quality and verification accuracy.

[0090] Specifically, from the hydrological data of a tributary of the Yangtze River to be verified, six entities labeled as upstream hydrological station, upstream station, midstream hydrological station, midstream station, downstream hydrological station, and downstream station can be identified. Among them, upstream hydrological stations are similar in name to upstream stations, midstream hydrological stations are similar in name to midstream stations, and downstream hydrological stations are similar in name to downstream stations. These entities with similar names are paired up to obtain three sets of candidate entity pairs: upstream hydrological station and upstream station, midstream hydrological station and midstream station, and downstream hydrological station and downstream station.

[0091] Furthermore, from the hydrological data of a tributary of the Yangtze River to be verified, six entities were identified as upstream hydrological stations, upstream stations, midstream hydrological stations, midstream stations, downstream hydrological stations, and downstream stations. Among them, upstream hydrological stations and upstream stations have similar names, midstream hydrological stations and midstream stations have similar names, and downstream hydrological stations and downstream stations have similar names. These entities with similar names were paired up to obtain three sets of candidate entity pairs: upstream hydrological stations and upstream stations, midstream hydrological stations and midstream stations, and downstream hydrological stations and downstream stations.
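The pairing of similarly named entities can be sketched with a simple name-similarity test. The patent does not say how name similarity is measured; the token-level Jaccard index and the 0.6 cut-off below are assumptions made only to give a runnable illustration with the six labels from this example.

```python
from itertools import combinations
from typing import List, Set, Tuple


def name_tokens(name: str) -> Set[str]:
    """Split a label into a set of lowercase words."""
    return set(name.lower().split())


def jaccard(a: str, b: str) -> float:
    """Share of words that two labels have in common."""
    tokens_a, tokens_b = name_tokens(a), name_tokens(b)
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def candidate_pairs(names: List[str], threshold: float = 0.6) -> List[Tuple[str, str]]:
    """Pair every two labels whose name similarity reaches the threshold."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if jaccard(a, b) >= threshold
    ]


names = [
    "upstream hydrological station",
    "upstream station",
    "midstream hydrological station",
    "midstream station",
    "downstream hydrological station",
    "downstream station",
]
pairs = candidate_pairs(names)
```

With these labels, each "hydrological station" variant shares two of three words with its short form, so exactly the three pairs named in the text are produced, while pairs across river sections stay below the cut-off.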

[0092] Then, for each candidate entity pair, the water level time series associated with each entity instance is extracted. For example, the water level time series of the upstream hydrological station records the monthly water level changes over the past ten years, and the water level time series of the upstream station also records the monthly water level changes over the past ten years.

[0093] Next, we analyze whether the changing trends of these two time series are consistent. For example, both show rising water levels from June to August each year and falling water levels from September to May of the following year. Their fluctuation characteristics are that they peak in summer and remain low in winter. Based on this, we generate a time similarity metric, which reflects the degree of fit between the two time series in terms of changing trends and fluctuation characteristics. The closer the value is to one, the more consistent the changing characteristics of the two series are.
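A time similarity metric of this kind can be sketched with the Pearson correlation coefficient, one of the two options the description names. The mapping of the coefficient from [-1, 1] to [0, 1] via (r + 1) / 2 is an assumption, as are the function names and the monthly water levels, which are invented sample values.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


def temporal_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Map the correlation from [-1, 1] to [0, 1] (assumed normalization)."""
    return (pearson(x, y) + 1.0) / 2.0


# Hypothetical monthly water levels (m) for one year, peaking in summer.
levels_a = [10.2, 10.1, 10.4, 11.0, 11.8, 13.5, 14.9, 14.2, 12.6, 11.3, 10.7, 10.3]
# Same pattern with a constant datum offset, as two records of one station might show.
levels_b = [value + 0.3 for value in levels_a]

similarity = temporal_similarity(levels_a, levels_b)
```

Because correlation ignores constant offsets, the two series above score close to one even though their absolute values differ, which matches the idea of comparing trends and fluctuation characteristics rather than raw levels.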

[0094] At the same time, the geographic coordinates of each entity instance are extracted. The geographic coordinates of the upstream hydrological station are 116°27′E, 30°15′N, and the geographic coordinates of the upstream station are 116°28′E, 30°16′N. The straight-line distance between the two geographic coordinates is determined, and a spatial proximity metric is generated based on the distance. The smaller the distance, the closer the metric is to one, and the larger the distance, the closer the metric is to zero.
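The spatial proximity metric for the two coordinates in this example can be sketched as follows. The great-circle (haversine) distance is swapped in here for the straight-line distance mentioned in the text, since the inputs are longitude and latitude; the Gaussian kernel is one of the options the description names, and its bandwidth of 5 km is an assumed value.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def dms_to_decimal(degrees: int, minutes: float) -> float:
    """Convert degrees and minutes to decimal degrees."""
    return degrees + minutes / 60.0


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in kilometres between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def spatial_proximity(distance_km: float, sigma_km: float = 5.0) -> float:
    """Gaussian kernel: 1 at zero distance, approaching 0 as distance grows."""
    return math.exp(-(distance_km ** 2) / (2 * sigma_km ** 2))


# Coordinates from the example: 116°27′E, 30°15′N and 116°28′E, 30°16′N.
lon_a, lat_a = dms_to_decimal(116, 27), dms_to_decimal(30, 15)
lon_b, lat_b = dms_to_decimal(116, 28), dms_to_decimal(30, 16)

distance = haversine_km(lon_a, lat_a, lon_b, lat_b)
proximity = spatial_proximity(distance)
```

The two points are a little under 2.5 km apart, so with the assumed 5 km bandwidth the proximity stays well above one half; a tighter bandwidth would penalize the same gap more strongly.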

[0095] If the tributary's flood season water level data shows a stable fluctuation pattern and high time series quality, then the weight coefficient corresponding to time similarity can be preset to 0.6, and the weight coefficient corresponding to spatial proximity can be preset to 0.4. The two weight coefficients are added together to one. The larger the weight coefficient of time similarity, the stronger the role of time dimension data in the judgment. The larger the weight coefficient of spatial proximity, the stronger the role of spatial dimension data in the judgment.

[0096] Furthermore, the temporal similarity metric of each candidate entity pair is multiplied by the temporal similarity weight coefficient, and the spatial proximity metric is multiplied by the spatial proximity weight coefficient. The two multiplication results are then added together to obtain the comprehensive similarity. The higher the comprehensive similarity, the higher the probability that the two entity instances point to the same hydrological entity.

[0097] It should be noted that the identity threshold can be set by statistically analyzing the comprehensive similarity of the entity pairs on the tributary that were manually confirmed as the same hydrological entity and merged over the past five years, and taking the minimum of those values as the threshold. For example, if the minimum value is 0.8, then this value can be preset as the identity threshold. If the comprehensive similarity is greater than or equal to the threshold, the pair is determined to be the same hydrological entity; if it is less than the threshold, the pair is determined to be different hydrological entities.
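The weighted combination and the threshold decision can be sketched as follows. The weights 0.6 and 0.4 and the threshold 0.8 come from the example above; the function names, the historical similarity values, and the metric values 0.9 and 0.7 are hypothetical.

```python
def comprehensive_similarity(time_sim, spatial_prox, w_time=0.6, w_space=0.4):
    """Weighted sum of the two metrics; the weights must sum to one."""
    if abs(w_time + w_space - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    return w_time * time_sim + w_space * spatial_prox


def identity_threshold(confirmed_similarities):
    """Minimum comprehensive similarity among manually confirmed merged pairs."""
    return min(confirmed_similarities)


def is_same_entity(similarity, threshold):
    """Pairs at or above the threshold are treated as the same entity."""
    return similarity >= threshold


# Hypothetical similarities of pairs confirmed and merged in past years.
history = [0.93, 0.80, 0.88]
threshold = identity_threshold(history)

# Hypothetical metric values for one candidate pair.
score = comprehensive_similarity(time_sim=0.9, spatial_prox=0.7)
print(score, is_same_entity(score, threshold))
```

With these inputs the score is approximately 0.82, which is at or above the 0.8 threshold, so the pair would be treated as the same hydrological entity.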

[0098] If the comprehensive similarity of the upstream hydrological station and the upstream station reaches the preset threshold, then the two entity instances are merged into one upstream station entity instance. If the comprehensive similarity of the midstream hydrological station and the midstream station does not reach the preset threshold, then they are determined to be two different entities, and their respective identifiers are retained. If the comprehensive similarity of the downstream hydrological station and the downstream station reaches the preset threshold, then the two entity instances are merged into one downstream station entity instance.

[0099] The final result is a semantically consistent hydrological knowledge graph instance, which contains four entity instances: upstream station, midstream hydrological station, midstream station, and downstream station, as well as the upstream and downstream relationships and other related information between the entity instances.
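The merge step can be sketched as a mapping from each extracted entity name to a canonical name. This is a simplified illustration: the function name and the three similarity scores are hypothetical, and a real implementation would also carry over attributes and relation instances.

```python
def merge_aligned_entities(entities, pair_scores, threshold):
    """Return a mapping from each entity name to its canonical (merged) name."""
    canonical = {name: name for name in entities}
    for (name_a, name_b), score in pair_scores.items():
        if score >= threshold:
            # Fold everything currently mapped to name_a's canonical name
            # into name_b's canonical name.
            source, target = canonical[name_a], canonical[name_b]
            for name in canonical:
                if canonical[name] == source:
                    canonical[name] = target
    return canonical


entities = [
    "upstream hydrological station", "upstream station",
    "midstream hydrological station", "midstream station",
    "downstream hydrological station", "downstream station",
]
# Hypothetical comprehensive similarities for the three candidate pairs.
pair_scores = {
    ("upstream hydrological station", "upstream station"): 0.86,
    ("midstream hydrological station", "midstream station"): 0.62,
    ("downstream hydrological station", "downstream station"): 0.91,
}
mapping = merge_aligned_entities(entities, pair_scores, threshold=0.8)
merged_entities = sorted(set(mapping.values()))
print(merged_entities)
```

The result contains the four entity instances named in the example: the upstream and downstream pairs are merged, and the two midstream entities stay separate.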

[0100] In S400 of this invention, converting logical verification items into graph pattern matching conditions includes:

[0101] Analyze the hydrological entity types and relationship types involved in the logical verification items to determine the node templates and edge templates in the graph pattern;

[0102] Add attribute constraints to the node template. The attribute constraints correspond to the numerical comparison relationships, timing relationships, or logical operation relationships in the logical verification items.

[0103] Combine node templates and edge templates into a graph pattern, and set this graph pattern as the matching condition for the logical verification item.

[0104] In this embodiment, by transforming the verification items described in natural language into structured graph pattern matching conditions, an effective bridge between domain knowledge and graph computing technology is achieved, enabling the verification rules to be directly queried on the graph database, thus providing an executable rule foundation for automated logical verification.

[0105] Specifically, the hydrological entity types and relationship types involved in the logical verification items are analyzed. For example, if the logical verification item "the flow in the same section does not increase along the course" is selected for a tributary of the Yangtze River, the analysis shows that the hydrological entity type involved in this verification item is the hydrological station of the tributary, and the relationship type involved is the flow connection relationship between the hydrological stations in the same section. Based on this, the node template in the graph pattern is determined to be the hydrological station of the tributary, and the edge template is the flow connection in the same section.

[0106] Then, attribute constraints are added to the determined node templates. These attribute constraints correspond to the numerical comparison relationships in the logical verification items. For example, for the verification requirement that "the flow rate does not increase along the same river section", flow attribute constraints are added to each hydrological station node template to clarify that the flow rate value of the next station along the water flow direction must not be greater than the flow rate value of the previous station.

[0107] Finally, the node templates and edge templates are combined into a graph pattern. For example, two hydrological station nodes are sequentially associated through the edge template of the same river segment flow connection. At the same time, additional flow attribute constraints are incorporated to form a complete graph pattern. If the graph pattern can accurately correspond to the verification requirement of "the flow in the same river segment does not increase along the way", then the graph pattern is set as the matching condition for the logical verification item.
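One way to picture the resulting graph pattern is as a small data structure that combines two node templates, one edge template, and an attribute constraint. The class and field names below are hypothetical and are not taken from the source; the constraint encodes the example requirement that the flow of the next station must not exceed that of the previous station.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NodeTemplate:
    entity_type: str


@dataclass(frozen=True)
class EdgeTemplate:
    relation_type: str


@dataclass(frozen=True)
class GraphPattern:
    source: NodeTemplate
    edge: EdgeTemplate
    target: NodeTemplate
    attribute: str
    constraint: Callable[[float, float], bool]  # (previous, next) -> satisfied?

    def satisfied_by(self, prev_attrs, next_attrs):
        """Check the attribute constraint for one matched pair of stations."""
        return self.constraint(prev_attrs[self.attribute],
                               next_attrs[self.attribute])


# "The flow in the same section does not increase along the course."
flow_not_increasing = GraphPattern(
    source=NodeTemplate("tributary_hydrological_station"),
    edge=EdgeTemplate("same_section_flow_connection"),
    target=NodeTemplate("tributary_hydrological_station"),
    attribute="flow",
    constraint=lambda prev, nxt: nxt <= prev,
)

print(flow_not_increasing.satisfied_by({"flow": 35.0}, {"flow": 30.0}))
print(flow_not_increasing.satisfied_by({"flow": 30.0}, {"flow": 35.0}))
```

In a graph database the same pattern would typically be expressed as a query; the in-memory form here is only meant to show which parts of the verification item map to nodes, edges, and constraints.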

[0108] In S500 of this invention, rule conflict detection is performed on the logical verification rule set, including:

[0109] For each logical verification rule, perform graph pattern matching condition parsing to extract the trigger graph pattern and consequent constraint expression of the rule;

[0110] Compare the trigger graph patterns and consequent constraint expressions of different rules to identify rule pairs that have logical inclusion, logical mutual exclusion, or logical redundancy relationships.

[0111] In S500 of this invention, priority ranking based on detection results includes:

[0112] For rule pairs that have logical inclusion, logical mutual exclusion, or logical redundancy, the rules to be retained are determined based on the universality of the hydrophysical laws upon which the rules are based or the accuracy of historical verification.

[0113] Each retained logical verification rule is assigned a priority weight, which is determined based on the strength of the hydrophysical laws involved in the rule or the confidence level of the statistical consistency constraints.

[0114] The retained logic verification rules are sorted in descending order of priority weight to generate an ordered rule list, which is then used as the optimized logic verification rule set.

[0115] As a preferred implementation, this solution identifies inclusion, mutual exclusion, or redundancy relationships between rules through rule conflict detection, and then prioritizes them based on the universality of physical laws and confidence levels. This eliminates logical conflicts between rules, determines the optimal execution order, and significantly improves the robustness of the rule set and the reliability of the verification results.

[0116] Specifically, graph pattern matching condition parsing is performed on each logical verification rule in the logical verification rule set.

[0117] For example, when parsing the rule that upstream and downstream water levels must not be reversed, the trigger graph pattern of the rule is that there is an upstream and downstream water flow connection between hydrological stations, and the consequent constraint expression of the rule is that the water level value of the upstream hydrological station is less than the water level value of the downstream hydrological station. When parsing the rule that the flow rate decreases along the same river section, the trigger graph pattern of the rule is that multiple hydrological stations belong to the same river channel, and the consequent constraint expression of the rule is that the flow rate value of the later station is less than the flow rate value of the earlier station along the water flow direction.

[0118] Furthermore, the trigger graph patterns and consequent constraint expressions of different logical verification rules are compared one by one. For example, the upstream and downstream water level constraint rules are compared with the flow constraint rules of the same river segment. The trigger graph patterns of both point to the upstream and downstream association of the river hydrological station, and the consequent constraints are both numerically decreasing along the direction of water flow. This indicates that there is a logical redundancy relationship between the rule pairs. Then, the rule that the water level increases with the increase of flow is compared with the rule that the water level decreases with the increase of flow. The consequent constraints of the two are completely opposite. This indicates that there is a logical mutual exclusion relationship between the rule pairs.
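The pairwise comparison can be sketched with a deliberately simplified rule representation, in which the trigger graph pattern is reduced to a label and the consequent constraint to a direction. The rule names, labels, and the classification logic are assumptions made for illustration; logical inclusion is not modeled here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    name: str
    trigger: str    # label standing in for the trigger graph pattern
    direction: str  # simplified consequent: "increasing" or "decreasing"


def classify_pair(rule_a, rule_b):
    """Return the relation between two rules, or None if they are unrelated."""
    if rule_a.trigger != rule_b.trigger:
        return None
    if rule_a.direction == rule_b.direction:
        return "redundancy"
    return "mutual_exclusion"


def detect_conflicts(rules):
    """Compare every pair of rules once and collect the related pairs."""
    found = []
    for a, b in combinations(rules, 2):
        relation = classify_pair(a, b)
        if relation is not None:
            found.append((a.name, b.name, relation))
    return found


rules = [
    Rule("water_level_order", "upstream_downstream", "decreasing"),
    Rule("flow_same_segment", "upstream_downstream", "decreasing"),
    Rule("level_rises_with_flow", "level_flow_relation", "increasing"),
    Rule("level_falls_with_flow", "level_flow_relation", "decreasing"),
]
conflicts = detect_conflicts(rules)
print(conflicts)
```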

[0119] Then, for the rule pairs that are identified as having logical inclusion, mutual exclusion, or redundancy relationships, the rules to be retained are selected based on the degree of universality of the hydrophysical laws upon which the rules are based.

[0120] It should be noted that the universality of hydrophysical laws can be determined by judging the coverage of the hydrological scenarios to which the laws apply. For example, if a certain hydrophysical law can be applied to all natural and artificial rivers, without being limited by region, season, or the width and depth of the river, it is judged to have a high degree of universality. If it can only be applied to specific types of rivers or fixed hydrological cycles, it is judged to have a medium degree of universality. If it can only be applied to a small local hydrological area, it is judged to have a low degree of universality.

[0121] For example, in logically redundant water level and flow constraint rules, the gravity flow law followed by upstream and downstream water levels is applicable to all hydrological stations with flow direction and is not affected by river basin and seasonal changes, so it is judged to have a high degree of universality. However, the law of decreasing flow along the same river section is only applicable to a single river section without tributaries, so it is judged to have a medium degree of universality. Rules with high universality are preferred to be retained, so the upstream and downstream water level constraint rules are retained. In logically mutually exclusive water level and flow rules, the water level increases with the increase of flow, which conforms to the real hydrological physical law and is applicable to the vast majority of natural rivers. It has high universality, so this rule is retained.

[0122] Next, priority weights are assigned to each retained logical verification rule. The priority weights are determined based on the strength of the hydrophysical laws corresponding to the rule. For example, the upstream and downstream water level constraint rule relies on the gravity flow law, which has the highest strength, so it is given the highest priority weight. The water level and flow rate association rule relies on a statistical law with the second highest strength, so it is given the next highest priority weight.

[0123] Finally, all retained logic verification rules are arranged in descending order of priority weight, forming an ordered list of rules with a fixed order. This ordered list of rules is directly used as the optimized set of logic verification rules.
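The retention and ordering steps can be sketched as follows, assuming each rule carries a universality level and a numeric priority weight. The numeric weights, the universality assigned to the "level falls with flow" rule, and the tie-breaking choice are hypothetical.

```python
UNIVERSALITY = {"low": 0, "medium": 1, "high": 2}


def optimize_rules(rules, conflicts):
    """Drop the less universal rule of each conflicting pair, then sort.

    rules: name -> {"universality": str, "priority": float}
    conflicts: list of (name_a, name_b) pairs flagged by conflict detection
    """
    dropped = set()
    for name_a, name_b in conflicts:
        level_a = UNIVERSALITY[rules[name_a]["universality"]]
        level_b = UNIVERSALITY[rules[name_b]["universality"]]
        # Keep the rule resting on the more universal law; ties keep name_a.
        dropped.add(name_b if level_a >= level_b else name_a)
    kept = [name for name in rules if name not in dropped]
    return sorted(kept, key=lambda name: rules[name]["priority"], reverse=True)


rules = {
    "level_rises_with_flow": {"universality": "high", "priority": 0.8},
    "water_level_order": {"universality": "high", "priority": 1.0},
    "flow_same_segment": {"universality": "medium", "priority": 0.7},
    "level_falls_with_flow": {"universality": "low", "priority": 0.1},
}
conflicts = [
    ("water_level_order", "flow_same_segment"),
    ("level_rises_with_flow", "level_falls_with_flow"),
]
ordered_rules = optimize_rules(rules, conflicts)
print(ordered_rules)
```

The ordered list places the water level constraint rule first and the water level and flow association rule second, mirroring the example.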

[0124] In S600 of this invention, a graph traversal is performed on hydrological knowledge graph instances based on the optimized set of logical verification rules to find graph paths that violate graph pattern matching conditions, thereby obtaining logical error candidates, including:

[0125] Starting with each entity instance, the graph is traversed by applying each logical verification rule in turn, in the order of the optimized logical verification rule set.

[0126] For the logical verification rule currently being applied, perform a breadth-first or depth-first traversal along the graph according to the relationship type specified in the rule to generate a set of candidate paths;

[0127] Each candidate path is matched with the trigger graph pattern of the current logical verification rule. If the match is successful, the attribute values of each entity instance on the candidate path are extracted.

[0128] The extracted attribute values are compared with the consequent constraint expression of the current logical verification rule to determine whether they meet the numerical range or logical relationship in the consequent constraint expression.

[0129] If the attribute values do not satisfy the consequent constraint expression, then the candidate path and the entity instances involved are marked as logical error candidates that violate the current logical verification rule.

[0130] In S600 of the present invention, graph traversal further includes:

[0131] Based on the intermediate constraints in the current logical verification rule, prune candidate paths that do not meet the intermediate constraints during the traversal process;

[0132] For candidate paths that meet the triggering conditions of the current logical verification rule, extract the attribute values of each entity instance on the candidate path and compare them with the expected attribute value range or logical relationship in the consequent constraint expression of the current rule.

[0133] Record all candidate paths that do not satisfy the consequent constraint expression, and save the identifier and attribute value of each entity instance on the candidate path as part of the graph context information of the logical error candidate.

[0134] In a preferred embodiment of the present invention, an optimized set of rules is applied sequentially on a semantically consistent knowledge graph using graph traversal technology. This enables the discovery of systematic logical errors across multiple nodes and relationships, and across different sites and elements. At the same time, traversal efficiency is improved and the traceability of error results is ensured through intermediate constraint pruning and context recording.

[0135] Specifically, taking the four entity instances of the upstream station, midstream hydrological station, midstream station and downstream station in the hydrological knowledge graph instance of a tributary of the Yangtze River as the starting point, the graph traversal is carried out by applying each logical verification rule in the predetermined order of the optimized logical verification rule set.

[0136] Furthermore, for the first logical verification rule being applied, a breadth-first traversal is performed along the hydrological knowledge graph instance according to the upstream and downstream water flow connection relationship between hydrological stations specified by the rule, sequentially visiting all directly and indirectly related entities from the starting entity.

[0137] For example, starting from an upstream station, the system first visits the directly associated midstream hydrological station, then visits the downstream station associated with the midstream hydrological station, and then directly visits the associated downstream station from the midstream station. This generates four paths: upstream station-midstream hydrological station, upstream station-midstream hydrological station-downstream station, midstream hydrological station-downstream station, and midstream station-downstream station. These paths are then combined to form a candidate path set.
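The path generation in this example can be reproduced with a short breadth-first enumeration. The adjacency list below is an assumed encoding of the upstream and downstream connections implied by the example, and `candidate_paths` is a hypothetical helper name.

```python
from collections import deque


def candidate_paths(graph, start):
    """Enumerate every downstream path of two or more nodes from `start`."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if nxt in path:
                continue  # guard against cycles
            extended = path + [nxt]
            paths.append(extended)
            queue.append(extended)
    return paths


# Assumed upstream -> downstream connections between the four entities.
graph = {
    "upstream station": ["midstream hydrological station"],
    "midstream hydrological station": ["downstream station"],
    "midstream station": ["downstream station"],
}
starts = ["upstream station", "midstream hydrological station",
          "midstream station", "downstream station"]
all_paths = [p for s in starts for p in candidate_paths(graph, s)]
for p in all_paths:
    print(" - ".join(p))
```

Starting from each of the four entity instances yields exactly the four candidate paths listed in the example; the downstream station has no outgoing connection and contributes none.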

[0138] Meanwhile, the logical verification rule currently being applied sets the intermediate constraint that all entity instances on a path belong to the same tributary. During the traversal, any path found to contain stations of another tributary does not meet this intermediate constraint and is removed, thus completing the pruning of candidate paths that do not meet the intermediate constraint.
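Pruning during traversal can be sketched by refusing to extend a path into any station that fails the intermediate constraint, so that such paths are never generated. The extra station, the tributary labels "A" and "B", and the helper name `pruned_paths` are hypothetical.

```python
from collections import deque


def pruned_paths(graph, start, allowed):
    """Breadth-first path enumeration that skips any station failing `allowed`."""
    if not allowed(start):
        return []
    paths, queue = [], deque([[start]])
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if nxt in path or not allowed(nxt):
                continue  # pruned during traversal, never added to the queue
            extended = path + [nxt]
            paths.append(extended)
            queue.append(extended)
    return paths


# Hypothetical graph with one station that belongs to another tributary.
graph_with_other = {
    "upstream station": ["midstream hydrological station",
                         "other tributary station"],
    "midstream hydrological station": ["downstream station"],
    "other tributary station": ["downstream station"],
}
tributary_of = {
    "upstream station": "A",
    "midstream hydrological station": "A",
    "downstream station": "A",
    "other tributary station": "B",
}
kept_paths = pruned_paths(graph_with_other, "upstream station",
                          lambda station: tributary_of[station] == "A")
print(kept_paths)
```

Pruning at expansion time avoids enumerating and then discarding every path that passes through the excluded station, which is where the efficiency gain comes from.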

[0139] Furthermore, each remaining candidate path after pruning is matched with the trigger graph pattern of the current logical verification rule. If the candidate path matches the trigger graph pattern of the rule, the water level attribute value of each entity instance on the candidate path is extracted. For example, for the upstream station-midstream hydrological station path, the water level of the upstream station is extracted as 2 meters and the water level of the midstream hydrological station is extracted as 2.5 meters.

[0140] Then, the extracted water level attribute values of each entity instance are compared with the consequent constraint expression of the current rule, which requires that the water level value of the upstream station along the water flow direction be less than the water level value of the downstream station. The water level relationship of each path is verified one by one.

[0141] If the water level at the upstream hydrological station is 2.5 meters and the water level at the downstream station is 2.3 meters on a candidate path, which does not satisfy the logical relationship in the consequent constraint expression, then the candidate path and the entity instances of the upstream and downstream hydrological stations involved in the path are marked as logical error candidates that violate the current logical verification rules.

[0142] Finally, record all candidate paths that do not satisfy the consequent constraint expression, and save the unique identifier of each entity instance on each candidate path and the corresponding water level attribute value. For example, save the identifiers of the midstream hydrological station and the downstream station and their respective water level values. This content is part of the graph context information of the logical error candidate.
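The comparison and recording steps can be sketched as follows. The consequent constraint is passed in as a function and follows the wording of the example (the upstream value must be less than the downstream value); the water level of the midstream station and the helper name `find_violations` are hypothetical.

```python
def find_violations(paths, water_level, constraint):
    """Return one record per candidate path that breaks the constraint."""
    violations = []
    for path in paths:
        for upstream, downstream in zip(path, path[1:]):
            if not constraint(water_level[upstream], water_level[downstream]):
                violations.append({
                    "path": path,
                    "violating_pair": (upstream, downstream),
                    # Graph context: identifier and value of every entity.
                    "context": {station: water_level[station] for station in path},
                })
                break  # one record per path is enough
    return violations


paths = [
    ["upstream station", "midstream hydrological station"],
    ["upstream station", "midstream hydrological station", "downstream station"],
    ["midstream hydrological station", "downstream station"],
    ["midstream station", "downstream station"],
]
water_level = {
    "upstream station": 2.0,
    "midstream hydrological station": 2.5,
    "midstream station": 2.1,  # hypothetical value
    "downstream station": 2.3,
}
violations = find_violations(paths, water_level,
                             constraint=lambda up, down: up < down)
for v in violations:
    print(v["violating_pair"], v["context"])
```

Both flagged paths share the pair of the midstream hydrological station (2.5 m) and the downstream station (2.3 m), whose identifiers and values are saved as graph context.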

[0143] In S700 of this invention, the output of logical error candidates and their corresponding graph context information includes:

[0144] Extract all entity instances and their attribute values on the graph path involved in each logical error candidate to generate a context dataset of logical error candidates;

[0145] A visual error report is generated based on the graph path. The visual error report graphically displays the relationships between entity instances, the locations of violations of logical verification rules, and related attribute values.

[0146] Establish an association index between the visualized error report and the corresponding data item in the original hydrological compilation data, and store it in the verification result database.

[0147] In this embodiment, by extracting complete contextual information of the graph and generating a visual error report, the error association and the location of rule violations are displayed intuitively. At the same time, an association index is established with the original data, providing clear clues for error cause analysis and data correction, which greatly facilitates users' understanding and location of errors.

[0148] Specifically, all entity instances and their attribute values on the graph path involved in each logical error candidate are extracted. For example, if the graph path corresponding to a logical error candidate is upstream station-midstream station, the two entity instances on this path, the upstream station and the midstream station, are extracted together with their attribute values: an upstream station flow rate of 30 cubic meters per second and a midstream station flow rate of 35 cubic meters per second. These entity instances and their corresponding attribute values are then integrated to generate the context dataset of the logical error candidate.

[0149] Then, a visual error report is generated based on the graph path. For example, for the graph path from the upstream station to the midstream station, the two entities are connected by a straight line, and the connection relationship of the water flow in the same river section is marked. The location where the flow of the midstream station is greater than that of the upstream station is marked in red, and the flow attribute value of each entity is marked next to it. The relationships between entity instances, the location of the violation of the logical verification rule, and the related attribute values are thus clearly displayed in a graphical way.
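The source does not name a rendering tool. As one possible sketch, the report for this path could be emitted as Graphviz DOT text, with the violating connection drawn in red and the flow value shown next to each entity. The function name `to_dot` and the label wording are hypothetical.

```python
def to_dot(path, flows, violating_pair):
    """Build a Graphviz DOT description of one logical error candidate."""
    lines = ["digraph error_report {"]
    for station in path:
        # "\\n" emits a literal \n, which DOT renders as a line break.
        lines.append(f'  "{station}" [label="{station}\\n{flows[station]} m3/s"];')
    for a, b in zip(path, path[1:]):
        color = "red" if (a, b) == violating_pair else "black"
        lines.append(f'  "{a}" -> "{b}" [label="same-section flow", color={color}];')
    lines.append("}")
    return "\n".join(lines)


flows = {"upstream station": 30, "midstream station": 35}
dot = to_dot(["upstream station", "midstream station"], flows,
             violating_pair=("upstream station", "midstream station"))
print(dot)
```

The resulting text can be rendered by any DOT-compatible viewer; producing an image file is outside the scope of this sketch.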

[0150] Finally, an association index is established between the visualized error report and the corresponding data items in the original hydrological compilation data. For example, the flow attribute values of the upstream and midstream stations in the visualized report are associated with the corresponding flow record entries of the upstream and midstream stations in the original hydrological compilation data to clarify the correspondence between the two. Then, the visualized error report and the association index are stored together in the verification result database for easy subsequent query and traceability.
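The association index can be pictured as a lookup from each attribute shown in the report to the identifier of the matching record in the original compilation data. The record layout, the record identifiers, and the function name below are hypothetical; writing the result to the verification result database is not shown.

```python
def build_association_index(context, original_records):
    """Map each (station, element) in the report to its original record id."""
    lookup = {(r["station"], r["element"]): r["record_id"]
              for r in original_records}
    return {
        f"{station}.{element}": lookup[(station, element)]
        for (station, element) in context
        if (station, element) in lookup
    }


# Context dataset of one logical error candidate (values from the example).
context = {
    ("upstream station", "flow"): 30.0,
    ("midstream station", "flow"): 35.0,
}
# Hypothetical rows of the original hydrological compilation data.
original_records = [
    {"record_id": "R-001", "station": "upstream station", "element": "flow"},
    {"record_id": "R-002", "station": "midstream station", "element": "flow"},
    {"record_id": "R-003", "station": "downstream station", "element": "flow"},
]
index = build_association_index(context, original_records)
print(index)
```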

[0151] In the several embodiments provided by this invention, it should be understood that the disclosed method can be implemented in other ways.

[0152] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the present invention can be implemented in other specific forms without departing from the spirit or essential characteristics of the present invention.

[0153] The embodiments of this application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence is the theory, method, and technology that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.

[0154] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A method for automatically checking logical errors of hydrological compilation results based on a knowledge graph, characterized in that the method includes the following steps: obtaining the hydrological compilation data to be verified; analyzing the hydrological elements contained in the hydrological compilation data and the relationships between the hydrological elements, and constructing a hydrological knowledge graph model; based on the hydrological knowledge graph model, extracting entity instances and relation instances from the hydrological compilation data, and performing entity alignment to handle entity ambiguities caused by differences in data description during the extraction process, generating semantically consistent hydrological knowledge graph instances; based on hydrophysical laws and statistical consistency requirements, determining logical verification items and transforming them into graph pattern matching conditions to generate a set of logical verification rules; performing rule conflict detection on the set of logical verification rules, and prioritizing the rules based on the detection results to obtain an optimized set of logical verification rules; based on the optimized set of logical verification rules, traversing the hydrological knowledge graph instances to find graph paths that violate the graph pattern matching conditions, obtaining logical error candidates; and outputting the logical error candidates and their corresponding graph context information.

2. The knowledge graph-based hydrological integrated compilation result logical error automatic checking method according to claim 1, characterized in that performing entity alignment to handle entity ambiguities caused by differences in data description during the extraction process, generating semantically consistent hydrological knowledge graph instances, includes: identifying, as candidate entity pairs, hydrological entities in the hydrological compilation data that have the same or similar names but may point to different hydrological entities; determining, for each candidate entity pair, the similarity of the associated time series data in the time dimension and the proximity of the associated spatial location data in the spatial dimension; determining, based on the temporal similarity and spatial proximity, whether each candidate entity pair points to the same hydrological entity; and merging the entity instances that are determined to point to the same hydrological entity to obtain semantically consistent hydrological knowledge graph instances.

3. The knowledge graph-based hydrological integrated compilation result logical error automatic checking method according to claim 2, characterized in that determining, for each candidate entity pair, the similarity of the associated time series data in the time dimension and the proximity of the associated spatial location data in the spatial dimension includes: extracting the water level or flow time series corresponding to each entity instance in the candidate entity pair, and generating a time similarity metric by analyzing the changing trend and fluctuation characteristics of the series; extracting the geographic coordinates or relative location description of each entity instance, and generating a spatial proximity metric by measuring the Euclidean distance or topological relationship between the coordinates; and weighting and combining the time similarity metric and the spatial proximity metric, wherein, when the combined metric meets the preset identity threshold condition, the candidate entity pair is determined to point to the same hydrological entity.

4. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 1, characterized in that performing rule conflict detection on the set of logical verification rules includes: for each logical verification rule, parsing its graph pattern matching condition to extract the trigger graph pattern and consequent constraint expression of the rule; and comparing the trigger graph patterns and consequent constraint expressions of different rules to identify rule pairs that have logical inclusion, logical mutual exclusion, or logical redundancy relationships.

5. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 4, characterized in that prioritizing the rules based on the detection results includes: for rule pairs that have logical inclusion, logical mutual exclusion, or logical redundancy relationships, determining the rules to be retained based on the universality of the hydrophysical laws upon which the rules are based or the accuracy of historical verification; assigning each retained logical verification rule a priority weight, which is determined based on the strength of the hydrophysical laws involved in the rule or the confidence level of the statistical consistency constraints; and sorting the retained logical verification rules in descending order of priority weight to generate an ordered rule list, which is then used as the optimized set of logical verification rules.

6. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 1, characterized in that traversing the hydrological knowledge graph instances based on the optimized set of logical verification rules to find graph paths that violate the graph pattern matching conditions, obtaining logical error candidates, includes: starting with each entity instance, traversing the graph by applying each logical verification rule in turn, in the order of the optimized set of logical verification rules; for the logical verification rule currently being applied, performing a breadth-first or depth-first traversal along the graph according to the relationship type specified in the rule to generate a set of candidate paths; matching each candidate path with the trigger graph pattern of the current logical verification rule and, if the match is successful, extracting the attribute values of each entity instance on the candidate path; comparing the extracted attribute values with the consequent constraint expression of the current logical verification rule to determine whether they meet the numerical range or logical relationship in the consequent constraint expression; and, if the attribute values do not satisfy the consequent constraint expression, marking the candidate path and the entity instances involved as logical error candidates that violate the current logical verification rule.

7. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 6, characterized in that the graph traversal further includes: based on the intermediate constraints in the current logical verification rule, pruning candidate paths that do not meet the intermediate constraints during the traversal process; for candidate paths that meet the triggering conditions of the current logical verification rule, extracting the attribute values of each entity instance on the candidate path and comparing them with the expected attribute value range or logical relationship in the consequent constraint expression of the current rule; and recording all candidate paths that do not satisfy the consequent constraint expression, and saving the identifier and attribute value of each entity instance on the candidate path as part of the graph context information of the logical error candidate.

8. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 1, characterized in that transforming the logical verification items into graph pattern matching conditions includes: analyzing the hydrological entity types and relationship types involved in the logical verification items to determine the node templates and edge templates in the graph pattern; adding attribute constraints to the node templates, the attribute constraints corresponding to the numerical comparison relationships, timing relationships, or logical operation relationships in the logical verification items; and combining the node templates and edge templates into a graph pattern, and setting this graph pattern as the matching condition for the logical verification item.

9. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 1, characterized in that outputting the logical error candidates and their corresponding graph context information includes: extracting all entity instances and their attribute values on the graph path involved in each logical error candidate to generate a context dataset of logical error candidates; generating a visual error report based on the graph path, the visual error report graphically displaying the relationships between entity instances, the locations of violations of logical verification rules, and related attribute values; and establishing an association index between the visualized error report and the corresponding data items in the original hydrological compilation data, and storing it in the verification result database.

10. The knowledge graph-based hydrological compilation result logical error automatic checking method according to claim 1, characterized in that analyzing the hydrological elements contained in the hydrological compilation data and the relationships between the hydrological elements, and constructing a hydrological knowledge graph model, includes: identifying hydrological stations, hydrological element types, and time sections from the hydrological compilation data as candidate entity types; based on the metadata information or the correlation between data records in the hydrological compilation data, determining the upstream and downstream relationships between hydrological stations, the observation relationships between hydrological stations and hydrological elements, and the functional dependencies between hydrological elements as candidate relationship types; and, based on the candidate entity types and candidate relationship types, defining the attribute set of each candidate entity type and the attribute constraints of each candidate relationship type to form the hydrological knowledge graph model.