An automatic label generation method based on data change monitoring

By monitoring continuous changes in data objects to construct a change convergence fingerprint and introducing a counter-proof verification mechanism, the problem of label instability caused by changes in data objects in a short period of time is solved, thereby improving the stability and reliability of label generation.

CN122285784A · Pending · Publication Date: 2026-06-26 · 广州美保科技有限公司

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
广州美保科技有限公司
Filing Date
2026-03-24
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing technologies fail to effectively handle continuous changes in data objects within a short period when generating data tags, leading to frequent tag fluctuations or the generation of incorrect tags, especially when the field information of the data object undergoes multiple modifications, manual corrections, source synchronization updates, or system corrections.

Method used

By continuously monitoring changes in target data objects, a change convergence fingerprint is constructed to identify stable semantic information. A counter-verification mechanism is introduced to avoid misjudgment of labels due to short-term changes or erroneous data.

Benefits of technology

It improves the stability and reliability of the tag generation results, ensuring that the tag generation process has the ability to continuously verify and adaptively adjust, and can generate long-term reliable tags in complex and dynamic data environments.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of data tag generation technology, specifically to an automatic tag generation method based on data change monitoring, comprising the following steps: extracting change retention segments, change cancellation segments, and change suspension segments from the continuous change process of a target data object within a preset monitoring interval, and constructing the corresponding change convergence fingerprint; performing tag solidification determination on the change convergence fingerprint, identifying stable semantic kernels that meet the solidification conditions, and generating corresponding tag units to be verified based on the stable semantic kernels; and performing counter-evidence resolution processing on the tag units to be verified to determine whether subsequent changes cause reverse destruction of the stable semantic kernels corresponding to those tag units. The invention can identify the truly stable parts of the data semantics during the change process and avoids prematurely generating tags due to temporary modifications, misoperations, or short-term fluctuations, thereby significantly improving the stability and reliability of the tag generation results.

Description

Technical Field

[0001] This invention relates to the field of data tag generation technology, and in particular to an automatic tag generation method based on data change monitoring.

Background Technology

[0002] With the development of data governance and data asset management technologies, an increasing number of business systems are beginning to semantically label and structurally manage data by generating tags for data objects. Data tags can reflect the attribute characteristics, business categories, or behavioral states of data objects, playing a crucial role in scenarios such as data retrieval, data analysis, intelligent recommendation, and data governance. Therefore, how to accurately and efficiently generate tags automatically for data objects has become a key technical issue in data management systems.

[0003] In existing technologies, automatic label generation is typically implemented based on field value matching rules, statistical analysis results, or machine learning models. For example, when a field meets a preset condition, a corresponding label is immediately generated, or a label update is triggered directly based on the field's state after a data update. These methods are relatively simple to implement, but they often rely solely on the data state at a single moment or the result of a single change, without fully considering the continuous changes that may occur over a period of time. However, in real-world business environments, the field information of data objects often undergoes multiple modifications within a short period, such as data entry, manual correction, source synchronization updates, or system corrections. These changes may exhibit complex patterns in the time series, such as modification followed by retraction or replacement. If labels are generated solely based on a single change or the current instantaneous state, it can easily lead to frequent label fluctuations or even incorrect labels. For instance, if a field value is temporarily modified and then restored to its original value within a short period, and the system generates a label at the moment of modification, that label may not actually reflect the true stable attributes of the data object.

Summary of the Invention

[0004] This invention provides an automatic tag generation method based on data change monitoring. In the context of continuous changes in data objects, it performs structured analysis on the semantic changes of fields to identify the truly stable semantic information. Furthermore, it introduces a counter-verification mechanism before tag generation to avoid misjudgment of tags due to short-term changes or erroneous data.

[0005] An automatic tag generation method based on data change monitoring includes the following steps:

[0006] S1, perform continuous change monitoring on the target data object. When field rewriting, field supplementation, field replacement, field withdrawal or field association migration is detected, instead of directly generating a label based on a single change result, the change retention segment, change cancellation segment and change suspension segment are extracted for the continuous change process of the target data object within the preset monitoring interval, and the corresponding change convergence fingerprint is constructed.

[0007] S2, perform label solidification determination on the change convergence fingerprint, segment the change retention segment, change cancellation segment and change suspension segment in the change convergence fingerprint, identify the stable semantic kernel that meets the solidification condition, and generate the corresponding label unit to be proved based on the stable semantic kernel; wherein, the label unit to be proved is used to characterize the candidate label result of the target data object that has the possibility of label generation in the current continuous change process, but still needs to exclude reverse evidence;

[0008] S3, perform counter-evidence resolution processing on the tag unit to be verified, retrieve the revocation changes, substitution changes, corrective changes or source overwriting changes that occur in the target data object during the subsequent listening phase, and determine whether they cause reverse destruction to the stable semantic kernel corresponding to the tag unit to be verified.

[0009] If no reverse destruction occurs, the tag unit to be verified is solidified into the final tag result and written into the tag area of the target data object;

[0010] If reverse destruction occurs, the solidified output of the tag unit to be verified is suppressed, and the change convergence fingerprint is updated.

[0011] Optionally, within the preset monitoring interval, the operation events of the target data object are recorded each time a field is rewritten, supplemented, replaced, withdrawn, or has its association migrated, forming a time-ordered sequence of change events. The operation events include at least the affected field identifier, the operation type, and snapshots of the field values before and after the operation.

[0012] Optionally, the sequence of change events is traversed, and the change trajectory of the value of each field is tracked independently: if the final state of a field at the end of the sequence has undergone a net change compared to the beginning of the sequence and the net change is not subsequently reversed or overwritten, then the final semantics of the current field is marked as a change retention segment; if a change of a field that occurs in the middle of the sequence is subsequently completely cancelled, then the cancelled change semantics is marked as a change cancellation segment; if a field is still in an unconfirmed state at the end of the sequence, then the current intermediate semantics of the current field is marked as a change suspension segment.

[0013] Optionally, all marked change retention segments, change cancellation segments, and change suspension segments are aggregated according to the semantic relationship between fields to generate a change convergence fingerprint that represents the overall semantic evolution convergence state of the target data object within a preset monitoring interval; wherein, the change convergence fingerprint is used to represent the semantic change parts that have been stably retained during the continuous change process of the target data object, the semantic change parts that have been cancelled by subsequent changes, and the semantic change parts that have not yet converged.

[0014] Optionally, S2 further includes parsing the change convergence fingerprint, extracting the semantics of all fields marked as change retention segments, the historical change trajectory marked as change cancellation segments, and the status of undetermined fields marked as change suspension segments.

[0015] Optionally, the semantics of each field in the change retention section are aggregated and analyzed to identify stable semantic kernels that meet preset solidification conditions. The preset solidification conditions include: the duration of the current field semantics in the change retention section exceeds a first threshold, the confidence level of the underlying data source on which the current field semantics depends is higher than a second threshold, and there is no fundamental conflict between the current field semantics and the historical semantics in the change offset section. If a stable semantic kernel that meets the preset solidification conditions is identified, the semantic pointer corresponding to the stable semantic kernel is instantiated into a candidate label expression, and an identifier indicating that it is currently in a state to be verified is assigned to the candidate label expression, thereby generating a label unit to be verified.

[0016] Optionally, the tag unit to be proved includes the semantic content of the stable semantic kernel, the solidified evidence index of the semantic content in the change convergence fingerprint, and a marker to be resolved; the marker to be resolved is used to indicate that the tag unit to be proved needs to be tested by subsequent rebuttal evidence resolution processing before final solidification, so as to exclude possible reverse destruction evidence.

[0017] Optionally, S3 includes, after generating the tag unit to be verified, continuing to perform continuous change monitoring on the target data object and retrieving subsequent change events that occur in subsequent monitoring phases; the subsequent change events include reversal changes, substitution changes, corrective changes, or source overwrite changes.

[0018] Optionally, the rebuttal resolution process specifically includes comparing and analyzing the subsequent change event with the stable semantic kernel corresponding to the label unit to be proven, and determining whether the subsequent change event causes reverse damage to the stable semantic kernel; the determination conditions for causing reverse damage include: the subsequent change event directly affects the key field constituting the stable semantic kernel, and the changed field value has a fundamental conflict with the semantic reference of the stable semantic kernel; or, although the subsequent change event does not directly affect the key field, it causes the supporting basis of the stable semantic kernel to disappear by canceling or overwriting the underlying data source on which the stable semantic kernel depends.

[0019] Optionally, if it is determined that no reverse destruction has occurred, the tag unit to be verified has passed the counter-evidence test, the tag unit to be verified is solidified as the final tag result, and the final tag result is written into the tag area of the target data object;

[0020] If a reverse disruption is determined, a suppression mechanism is triggered to prevent the label unit to be verified from being solidified into the final label result. Based on the change information carried by the subsequent change event, the change convergence fingerprint of the target data object is updated so as to include the subsequent change event in the convergence analysis within the new monitoring interval.
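The two determination conditions and the two outcomes described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the event dictionary keys (`field_id`, `op_type`, `after`, `source`), the operation names in `SOURCE_OPS`, and the reading of "fundamental conflict" as simple inequality with the kernel value are all hypothetical and are not taken from the patent text.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical operation names for revoking or overwriting an underlying source.
SOURCE_OPS = {"source_revoke", "source_overwrite"}


def causes_reverse_destruction(kernel: Dict[str, str],
                               kernel_sources: Set[str],
                               later_events: Iterable[dict]) -> bool:
    """Return True if any later event destroys the stable semantic kernel.

    kernel maps each key field to its stable value; kernel_sources names the
    data sources the kernel depends on.
    """
    for ev in later_events:
        field_id = ev.get("field_id")
        # Condition 1: a key field is changed to a value that conflicts with
        # the kernel (assumption: "conflict" is modelled as inequality).
        if field_id in kernel and ev.get("after") != kernel[field_id]:
            return True
        # Condition 2: the supporting data source is revoked or overwritten.
        if ev.get("op_type") in SOURCE_OPS and ev.get("source") in kernel_sources:
            return True
    return False


def resolve(pending_tag: str,
            kernel: Dict[str, str],
            kernel_sources: Set[str],
            later_events: Iterable[dict]) -> Optional[str]:
    """Return the solidified tag, or None when solidification is suppressed."""
    if causes_reverse_destruction(kernel, kernel_sources, later_events):
        return None  # caller would then update the change convergence fingerprint
    return pending_tag
```

In this sketch a `None` result stands for the suppression branch; updating the change convergence fingerprint with the later events is left to the caller.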

[0021] The beneficial effects of this invention are:

[0022] This invention, when monitoring changes to a target data object, does not directly generate labels based solely on a single field modification. Instead, it records the sequence of change events within the monitoring interval, extracts change retention segments, change cancellation segments, and change suspension segments, and constructs a change convergence fingerprint based on this. This fingerprint comprehensively reflects the semantic evolution trajectory of the target data object during continuous changes, structurally distinguishing between stable semantic changes, historical changes cancelled by subsequent changes, and intermediate semantic states that have not yet converged. In this way, the truly stable parts of the data semantics during the change process can be identified, preventing premature label generation due to temporary modifications, misoperations, or short-term fluctuations, thereby significantly improving the stability and reliability of the label generation results.

[0023] This invention, after obtaining the convergence fingerprint of changes, further aggregates and analyzes the semantics of the fields in the change-preserving segment. It identifies stable semantic kernels by pre-setting solidification conditions. These solidification conditions simultaneously consider the duration of semantic existence, the confidence level of the data source, and historical semantic conflicts. Only when the semantics are stable in the long term, the source is reliable, and there is no fundamental conflict with historical semantics will a kernel be identified as a stable semantic kernel. Through this multi-condition joint judgment mechanism, unstable semantics caused by short-term data fluctuations, unreliable sources, or historical semantic conflicts can be effectively filtered out. This ensures that the semantic basis used to generate tags has high stability and credibility, further improving the semantic accuracy of the automatic tag generation system.

[0024] This invention continues to monitor changes after generating the tag unit to be verified, and performs counter-evidence resolution analysis on the stable semantic kernel through subsequent change events. When a reversal change, substitution change, corrective change, or source overwriting change is detected that reverses the stable semantic kernel, the system will trigger a suppression mechanism to prevent the tag from solidifying, and update the change convergence fingerprint based on the new change information, causing the change to re-enter a new semantic convergence analysis process. Through this counter-evidence resolution and dynamic update mechanism of the convergence structure, this invention can automatically correct the original tag judgment results when the data semantics evolve, enabling the tag generation process to have continuous verification and adaptive adjustment capabilities. This ensures that the final generated tags are always based on stable semantics that have undergone sufficient change verification, significantly improving the long-term reliability of the tag system in complex and dynamic data environments.

Description of the Drawings

[0025] To more clearly illustrate the technical solutions in this invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only for this invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0026] Figure 1 is a schematic diagram of the label generation method according to an embodiment of the present invention;

[0027] Figure 2 is a schematic diagram of the tag generation logic in an embodiment of the present invention.

Detailed Implementation

[0028] The present invention will now be described in detail with reference to the accompanying drawings and specific embodiments. For some well-known technologies, those skilled in the art may also use other alternative methods to implement the invention. Moreover, the accompanying drawings are only for more specific description of the embodiments and are not intended to specifically limit the present invention.

[0029] As shown in Figures 1-2, an automatic tag generation method based on data change monitoring includes the following steps:

[0030] S1. Perform continuous change monitoring on the target data object. When field rewriting, field supplementation, field replacement, field withdrawal, or field association migration is detected, instead of directly generating a label based on a single change result, the change retention segment, change cancellation segment, and change suspension segment are extracted for the continuous change process of the target data object within a preset monitoring interval, and the corresponding change convergence fingerprint is constructed.

[0031] S11, within the preset monitoring interval, record operation events performed on the target data object. When a field is modified, supplemented, replaced, withdrawn, or its association is migrated, the corresponding operation is recorded as a change event, and the events are sorted by occurrence time to form a change event sequence, represented as: $E = \{e_1, e_2, \dots, e_n\}$. Each change event $e_i$ includes at least the field identifier, the operation type, and snapshot information of the field value change, and is represented as: $e_i = (f_i, o_i, v_i^{\mathrm{pre}}, v_i^{\mathrm{post}}, t_i)$; where $E$ represents the change event sequence; $e_i$ denotes the $i$-th change event; $f_i$ denotes the field identifier affected by the change event; $o_i$ denotes the operation type, which includes field rewriting, field supplementation, field replacement, field withdrawal, or field association migration; $v_i^{\mathrm{pre}}$ denotes the snapshot of the field's value before the operation occurred; $v_i^{\mathrm{post}}$ denotes the snapshot of the field's value after the operation occurred; $t_i$ denotes the timestamp of the change event; and $n$ denotes the total number of change events recorded within the preset monitoring interval.

[0032] The preset monitoring interval is a time range or event window during which the system continuously monitors data changes. It is set using a time window method, with a fixed time length, and collects all change events within that time period.

[0033] The above process creates a time-ordered sequence of change events for the target data object, which describes all field changes that occur within the monitoring period.
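As a minimal illustration of S11, the change event record and the time-ordered sequence could be modelled as below. The class name `ChangeEvent`, its attribute names, the operation-type strings, and the use of numeric timestamps in seconds are assumptions made for this sketch and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass(frozen=True)
class ChangeEvent:
    field_id: str     # identifier of the affected field
    op_type: str      # e.g. "rewrite", "supplement", "replace", "withdraw", "migrate"
    before: Any       # snapshot of the field value before the operation
    after: Any        # snapshot of the field value after the operation
    timestamp: float  # time of the change event (assumed: seconds)


def build_event_sequence(events: Iterable[ChangeEvent],
                         window_start: float,
                         window_end: float) -> List[ChangeEvent]:
    """Keep the events inside the fixed-length time window, ordered by time."""
    in_window = [e for e in events if window_start <= e.timestamp <= window_end]
    return sorted(in_window, key=lambda e: e.timestamp)
```

For example, calling `build_event_sequence(events, 0.0, 100.0)` drops any event stamped after second 100 and returns the remaining events in chronological order.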

[0034] S12, after obtaining the sequence of change events, instead of directly analyzing these changes from an overall perspective, the changes in the value of each field within the monitoring interval are tracked separately, forming a field-level change trajectory. This allows for further determination of whether these changes ultimately belong to retained changes, cancelled changes, or still undetermined changes. First, the sequence of change events obtained in S11 is traversed. This sequence contains various change operations occurring in multiple fields at different times. To accurately analyze the changes of a specific field, change events belonging to the same field are extracted separately, forming a subsequence of changes corresponding to that field, ensuring each field has an independent change history. After obtaining the subsequence of changes for a field, the field value changes are re-organized chronologically, and the changed values are connected sequentially to form the field value change trajectory of that field within the entire monitoring interval. This change trajectory describes the series of changes that the field value has undergone from the start of the monitoring interval to the current moment. For example, if a field initially has value A, and during the monitoring period it is successively modified to B, then to C, and finally to D, then the field's change trajectory can be understood as A→B→C→D. This trajectory allows for a clear observation of the complete evolution of the field value. After obtaining the trajectory, the changes can be categorized and judged based on its characteristics.

[0035] Specifically, it involves traversing the sequence of change events in S11. For each field, the value change trajectory is tracked independently.

[0036] For any field $f$, extract from the change event sequence $E$ obtained in S11 the change subsequence corresponding to this field: $E_f = \{ e \in E \mid e \text{ affects field } f \}$;

[0037] And construct the field value change trajectory based on the change subsequence: $T_f = (v_0 \to v_1 \to \dots \to v_m)$;

[0038] The states in the field value change trajectory correspond, in chronological order, to the values taken by the field within the monitoring interval.

[0039] where $E_f$ denotes the change event subsequence of field $f$; $T_f$ denotes the value change trajectory of field $f$; $v_0$ denotes the initial field value at the beginning of the monitoring interval; $v_m$ denotes the field value at the end of the current change sequence; and $m$ denotes the number of times the field has changed within the monitoring interval.

[0040] After obtaining the trajectory of field value changes, the semantic changes of the fields are classified and determined based on the trajectory features:

[0041] (1) When the final state of a field has a net change relative to its initial state and this change is not subsequently reversed or overwritten, i.e., $v_m \neq v_0$ (with $v_0$ and $v_m$ the field's initial and final values) and no subsequent change restores the field to its original state, the final semantic change of the field is marked as a change retention segment.

[0042] (2) When a field undergoes an intermediate change in its trajectory, but this change is subsequently completely offset, for example $v_0 \to v_1 \to v_0$, so that the semantic change has been negated by subsequent changes, that part of the semantic change is marked as a change cancellation segment.

[0043] (3) When a field is still in an unconfirmed state at the end of the change trajectory, such as when the associated migration has not been completed or the supplementary information is still in an unconfirmed state, the current intermediate semantic state of the field is marked as a change suspension segment.

[0044] Through the above classification and determination, the change trajectory of each field is divided into three semantic change structures: change retention segment, change cancellation segment, and change suspension segment. This step is actually to identify the true validity of field changes, providing a foundation for subsequent generation of change convergence fingerprints and label solidification determination.
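The three-way classification can be sketched on a single field's trajectory as follows. This is a simplified illustration, not the patent's procedure: the function name, the `unconfirmed` flag, and in particular the choice to treat every intermediate value that differs from the final value (whether reverted or merely overwritten) as a cancelled change are assumptions of this sketch.

```python
from typing import Any, Dict, List


def classify_trajectory(trajectory: List[Any],
                        unconfirmed: bool = False) -> Dict[str, list]:
    """Classify one field's value trajectory [v0, v1, ..., vm].

    unconfirmed=True means the field's last state is still pending
    (for example, an association migration that has not completed).
    """
    result: Dict[str, list] = {"retained": [], "cancelled": [], "suspended": []}
    if len(trajectory) < 2:
        return result  # the field never changed inside the interval

    initial, final = trajectory[0], trajectory[-1]
    if unconfirmed:
        result["suspended"].append(final)   # case (3): not yet converged
    elif final != initial:
        result["retained"].append(final)    # case (1): net change survived

    # Case (2): intermediate values that did not survive to the end
    # (assumption: overwritten values are treated like reverted ones).
    for value in trajectory[1:-1]:
        if value != final and value not in result["cancelled"]:
            result["cancelled"].append(value)
    return result
```

Under these assumptions, the trajectory A→B→C→D from the example above yields D as retained with B and C as cancelled, while A→B→A yields no retained change and B as cancelled.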

[0045] S13, aggregate all marked change retention segments, change cancellation segments, and change suspension segments according to the semantic relationships between fields to generate a data fingerprint, i.e., a change convergence fingerprint, describing the overall semantic evolution convergence state of the target data object within the monitoring interval, represented as: $F = \{R, C, H\}$, where $F$ represents the change convergence fingerprint; $R$ represents the set of change retention segments, used to record the semantic changes that have been stably retained; $C$ represents the set of change cancellation segments, used to record the semantic changes that have been cancelled by subsequent changes during the change process; and $H$ represents the set of change suspension segments, used to record the semantic changes that have not yet been confirmed as convergent. When generating the change convergence fingerprint, the change segments of each field are structured according to the semantic relationships between the fields, so that the change convergence fingerprint can simultaneously identify which semantic changes have been stably retained, which semantic changes have been cancelled by subsequent changes, and which semantic changes are still in a suspended state. This forms a data structure representing the overall semantic evolution convergence state of the target data object within the monitoring interval, and provides a basis for the label solidification determination in subsequent steps.

[0046] During aggregation, a field semantic association mapping table is first established or retrieved. This table describes fields belonging to the same semantic theme or business attribute. For example, company name, unified social credit code, and registered address can be grouped into the company identity semantic group, while industry category, business scope, and main business can be grouped into the industry semantic group. Next, all change retention segments, change offset segments, and change suspension segments corresponding to each field are traversed. Based on the field semantic association mapping table, change segments belonging to the same semantic group are grouped into the same semantic aggregation unit. During this process, each semantic aggregation unit internally counts the change retention segments, change offset segments, and change suspension segments appearing within that semantic group. Then, a corresponding change state description is generated for each semantic aggregation unit, indicating whether there are stable retention changes, offset changes, or suspension changes within the semantic group. Finally, the state information of all semantic aggregation units is combined to form a unified data structure. This structure consists of three parts: a set of change retention segments, a set of change offset segments, and a set of change suspension segments. It records the field semantic group and change semantic content corresponding to each type of change in a structured manner. This structured result is the convergence fingerprint of the target data object's changes within the current monitoring interval. It describes the convergence state of the object's overall semantic changes and provides a basis for subsequent label generation and solidification judgment.
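The aggregation described above can be illustrated with a short sketch. The mapping table contents, the field names, the `"ungrouped"` fallback, and the nested-dictionary layout of the fingerprint are hypothetical choices made for illustration; the patent does not prescribe a concrete data layout.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical field-to-semantic-group mapping table, following the
# company identity / industry example groups in the text.
SEMANTIC_GROUPS = {
    "company_name": "company_identity",
    "credit_code": "company_identity",
    "registered_address": "company_identity",
    "industry_category": "industry",
    "business_scope": "industry",
}


def build_fingerprint(
    segments: List[Tuple[str, str, str]]
) -> Dict[str, Dict[str, list]]:
    """Aggregate (field_id, kind, semantic) segments into F = {R, C, H}.

    kind is "R" (retention), "C" (cancellation) or "H" (suspension).
    Each part maps a semantic group to the segments recorded for it.
    """
    fingerprint = {"R": defaultdict(list), "C": defaultdict(list), "H": defaultdict(list)}
    for field_id, kind, semantic in segments:
        group = SEMANTIC_GROUPS.get(field_id, "ungrouped")
        fingerprint[kind][group].append((field_id, semantic))
    # Convert to plain dicts so the result is easy to compare and serialize.
    return {kind: dict(groups) for kind, groups in fingerprint.items()}
```

Grouping by semantic theme before splitting into the three sets keeps related fields together, so a later step can judge a whole semantic group rather than isolated fields.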

[0047] S2, perform label solidification determination on the change convergence fingerprint, segment the change retention segment, change cancellation segment and change suspension segment in the change convergence fingerprint, identify the stable semantic kernel that meets the solidification condition, and generate the corresponding label unit to be proved based on the stable semantic kernel; wherein, the label unit to be proved is used to characterize the candidate label result of the target data object that has the possibility of label generation in the current continuous change process, but still needs to exclude reverse evidence.

[0048] S21, parse the change convergence fingerprint $F$: extract from it the semantics of all fields marked as change retention segments, the historical evolution trajectories of fields marked as change cancellation segments, and the undetermined field states of fields marked as change suspension segments; that is, extract three sets from the change convergence fingerprint respectively:

[0049] For the change retention segments, extract the semantic information of the fields recorded therein. The semantic information represents the meaning of the fields that have been stably preserved after multiple changes within the monitoring period, for example, the industry type, enterprise category, or attribute characteristics that a certain field is ultimately determined to have. This part of the semantics is an important basis for subsequently judging whether a label can be generated: $R = \{r_1, r_2, \dots, r_p\}$;

[0050] For the change cancellation segments, extract the recorded historical change trajectories. A historical change trajectory arises when a field has undergone a certain semantic change during the change process, but the change is completely undone or restored in subsequent operations. For example, the field value is first modified to a new value, but is later changed back to the original value. Although such changes have occurred, they do not affect the final semantics of the data object. They are therefore recorded as historical trajectories for subsequently judging whether new semantics conflict with the historical changes: $C = \{c_1, c_2, \dots, c_q\}$;

[0051] For the change suspension segments, extract the recorded pending field states. A pending field state indicates that the field is still in an unconfirmed state of change at the end of the current monitoring interval. For example, some relationships have not yet been migrated, or some fields have not yet been confirmed. Such semantics cannot be regarded as stable changes for the time being and need to be observed in subsequent stages: $H = \{h_1, h_2, \dots, h_s\}$;

[0052] where $F$ denotes the change convergence fingerprint; $R$ denotes the set of change retention segments, used to record the semantics of fields that are stably preserved within the monitoring interval; $C$ denotes the set of change cancellation segments, used to record the historical semantic trajectories that were completely cancelled by subsequent changes during the change process; $H$ denotes the set of change suspension segments, used to record the semantics of fields that are currently in an unconfirmed state; $r_i$ denotes the $i$-th change retention segment semantic unit; $c_j$ denotes the $j$-th change cancellation segment historical semantic trajectory unit; and $h_k$ denotes the $k$-th change suspension segment field state unit.

[0053] By analyzing the convergent fingerprint of changes, three types of semantic structures are obtained: stable preserved semantics, resolved semantics, and unconfirmed semantics, which provide input data for subsequent stable semantic kernel recognition.

[0054] S22. After parsing out the change-preserving segments, it is necessary to further analyze the stable retained field semantics to identify stable semantic kernels that can be used as the basis for tag generation. As mentioned above, some field semantics have been identified as belonging to the change-preserving segments. This means that these semantics were not ultimately revoked or overwritten within the monitoring interval and still remain valid semantics for the current data object. However, being in the change-preserving segments does not necessarily mean that the semantics are stable enough. For example, some fields may have recently changed, or the source data may have low reliability, or the semantics may have obvious conflicts with historical changes. If tags are generated directly based on these semantics, incorrect or unstable tag results may be produced.

[0055] Therefore, by performing aggregate analysis on the semantics of each field in the change retention segments, stable semantic kernels that meet the preset solidification conditions are identified. Let any semantic unit of any field in the change retention segments be: $r_i = (f_i, s_i, \tau_i, \rho_i)$; where $r_i \in R$, and $f_i$ denotes the field identifier. $s_i$ denotes the semantic value that the field has stably retained. It comes from the final state value of the corresponding field in the change retention segment: all change records of the field are found in the change event sequence, the field value change trajectory is constructed in time order, and the field value that is still valid at the end of the monitoring interval is identified; if the value is not revoked or replaced in subsequent changes, the final field value is taken as the semantic value that is currently stably retained. $\tau_i$ denotes the duration of the semantics since its formation. The duration is the difference between the semantic formation time and the end time of the current monitoring interval: find the time point when the current semantics of this field first appears in the sequence of change events, record the timestamp of the first formation of the semantics, obtain the end time of the current monitoring interval, and calculate the time difference between the two. $\rho_i$ denotes the confidence level of the data source on which the semantics of this field depend. The source confidence level is preset for different data sources. First, identify the source system or source operation type of the semantics of this field: if it is synchronized from an official business system, the confidence is set to 0.95; if it comes from a public data interface, 0.90; if it is automatically generated by an internal system, 0.80; if it is manually entered, 0.70.

[0056] When identifying stable semantic kernels, each semantic unit $u = (f, v, t, c)$ in the change retention segment, with duration $t$ and source confidence $c$, is checked against the following solidification conditions: $t > T_1$ and $c > T_2$; and the semantic value $v$ has no fundamental conflict with the historical semantics in the set of change cancellation segments.

[0057] where $T_1$ represents the first threshold, which sets the minimum duration for which the semantics of a field must persist, and $T_2$ represents the second threshold, which sets the minimum requirement for the confidence level of the data source.

[0058] The first threshold value is set based on the stable period of business data:

[0059] High-frequency business data: 5 minutes to 30 minutes;

[0060] Regular business data: 1 hour to 24 hours;

[0061] Basic enterprise information data: 1 to 7 days;

[0062] The second threshold is set in the range of 0.7 to 0.9. If the system requires high data reliability, it is set to 0.85; if the system allows for more flexible data sources, it is set to 0.75.

[0063] When the semantics of a field meet all of the above conditions simultaneously, they are identified as a stable semantic kernel, forming a set of stable semantic kernels: $K = \{k_1, k_2, \dots, k_n\}$; where $K$ represents the set of stable semantic kernels, $k_i$ represents the $i$-th stable semantic kernel, and $n$ represents the number of stable semantic kernels.
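The solidification check described in [0055] to [0063] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `SemanticUnit`, `SOURCE_CONFIDENCE`, and `find_stable_kernels` are hypothetical, the strict comparisons follow the wording of claim 6 ("exceeding", "higher than"), and the conflict test against the change cancellation segment is left to a caller-supplied function because the text does not define it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List

# Preset source confidence levels, using the values listed in [0055].
# The dictionary keys are hypothetical names for the four source types.
SOURCE_CONFIDENCE = {
    "official_business_system": 0.95,
    "public_data_interface": 0.90,
    "internal_auto_generated": 0.80,
    "manual_entry": 0.70,
}


@dataclass(frozen=True)
class SemanticUnit:
    field_id: str        # field identifier
    value: str           # semantic value retained at the end of the interval
    formed_at: datetime  # when this value first appeared in the event sequence
    source: str          # key into SOURCE_CONFIDENCE


def find_stable_kernels(
    units: List[SemanticUnit],
    interval_end: datetime,
    min_duration: timedelta,   # first threshold
    min_confidence: float,     # second threshold
    conflicts_with_history: Callable[[SemanticUnit], bool],
) -> List[SemanticUnit]:
    """Return the units that satisfy all three solidification conditions."""
    kernels = []
    for unit in units:
        duration = interval_end - unit.formed_at
        confidence = SOURCE_CONFIDENCE[unit.source]
        if (duration > min_duration
                and confidence > min_confidence
                and not conflicts_with_history(unit)):
            kernels.append(unit)
    return kernels


end = datetime(2026, 3, 24, 12, 0)
units = [
    SemanticUnit("industry_category", "High-end manufacturing",
                 end - timedelta(days=3), "official_business_system"),
    SemanticUnit("enterprise_size", "Large",
                 end - timedelta(minutes=10), "official_business_system"),
    SemanticUnit("business_type", "Export",
                 end - timedelta(days=3), "manual_entry"),
]
kernels = find_stable_kernels(
    units, end,
    min_duration=timedelta(days=1),   # basic enterprise information range
    min_confidence=0.85,              # high-reliability setting
    conflicts_with_history=lambda unit: False,  # stand-in: no conflicts
)
print([k.field_id for k in kernels])  # ['industry_category']
```

In this example the second unit fails the duration condition (10 minutes against a 1-day threshold) and the third fails the confidence condition (0.70 against 0.85), so only the first becomes a stable semantic kernel.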

[0064] S23. After a stable semantic kernel that meets the solidification conditions has been identified in S22, the kernel is not written directly into the tag system. Instead, it is first converted into a structured candidate tag expression and then encapsulated into a data structure in a pending-verification state, namely a tag unit to be verified. This prevents the system from generating the final tag too early: a tag candidate is formed first and waits for the counter-evidence detection in the subsequent monitoring phase to confirm whether the tag is truly stable and reliable.

[0065] Specifically, if a stable semantic kernel that meets the preset solidification conditions is identified, the semantic pointer corresponding to the stable semantic kernel is instantiated into a candidate tag expression, and an identifier indicating its current pending-verification state is assigned to the candidate tag expression, thereby generating a tag unit to be verified. Let any stable semantic kernel be $k_i$; its corresponding candidate tag expression is then: $e_i = \Phi(k_i)$; where $e_i$ represents the candidate tag expression generated from the stable semantic kernel $k_i$, and $\Phi$ is the semantic pointer instantiation function, which converts a stable semantic kernel into a tag expression structure.

[0066] Subsequently, the candidate tag expression is built into a tag unit to be verified: $U_i = (e_i, I_i, m_i)$; where $U_i$ represents the $i$-th tag unit to be verified; $e_i$ represents the candidate tag expression, that is, the tag semantic content corresponding to the stable semantic kernel; $I_i$ represents the solidified evidence index, which records the position of the stable semantic kernel's evidence in the change convergence fingerprint, such as the corresponding field, change event number, and change trajectory position, so that the original change evidence can be found through this index if the basis for tag generation needs to be traced later; and $m_i$ represents the marker to be resolved, which indicates that the tag unit is currently pending verification. The tag must still pass the counter-evidence resolution process in subsequent steps before it is finally solidified; if new changes found in the subsequent monitoring phase cause reverse destruction of the semantics, solidification of the tag is cancelled.

[0067] The tag unit to be verified includes the semantic content of a stable semantic kernel, the solidified evidence index of that semantic content in the change convergence fingerprint, and the marker to be resolved. The marker to be resolved indicates that the tag unit to be verified must pass the subsequent counter-evidence resolution processing before it is finally solidified into a tag, so that possible reverse destruction evidence is excluded.

[0068] The execution flow of the semantic pointer instantiation function is as follows:

[0069] The first step is to read the field information from the stable semantic kernel. The stable semantic kernel contains two core elements: field identifiers and field semantic values. For example:

[0070] Field identifier: Industry category;

[0071] Field semantic value: High-end manufacturing;

[0072] The second step is to look up the tag mapping rule table. A mapping table from field semantics to tag structure is maintained in advance, as shown in Table 1 below.

[0073] Table 1 Tag Mapping Rules

[0074]

| Field name | Tag prefix |
| --- | --- |
| Industry category | industry |
| Enterprise size | scale |
| Business type | business |

[0075] Find the corresponding tag prefix in the table based on the field identifier.

[0076] The third step is to construct the tag expression structure by combining the tag prefix with the semantic value to generate a standard tag expression.
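The three steps above, together with the tag unit structure from [0066], can be sketched as follows. This is an illustrative sketch only: the `prefix:value` separator, the names `TAG_PREFIXES`, `PendingTagUnit`, `instantiate`, and `make_pending_unit`, and the simplified evidence index (field plus change event number) are assumptions, since the text does not fix a concrete expression format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Table 1: field name -> tag prefix.
TAG_PREFIXES: Dict[str, str] = {
    "Industry category": "industry",
    "Enterprise size": "scale",
    "Business type": "business",
}


@dataclass(frozen=True)
class PendingTagUnit:
    expression: str                  # candidate tag expression
    evidence_index: Tuple[str, int]  # (field, change event number), simplified
    pending: bool = True             # marker to be resolved


def instantiate(field_id: str, value: str) -> str:
    """Semantic pointer instantiation: kernel fields -> tag expression."""
    # Step 1 is the caller passing in the kernel's identifier and value.
    prefix = TAG_PREFIXES[field_id]  # step 2: raises KeyError if no rule exists
    return f"{prefix}:{value}"       # step 3: ':' separator is an assumption


def make_pending_unit(field_id: str, value: str, event_number: int) -> PendingTagUnit:
    """Wrap the candidate expression in a unit that still awaits verification."""
    return PendingTagUnit(instantiate(field_id, value), (field_id, event_number))


unit = make_pending_unit("Industry category", "High-end manufacturing", 17)
print(unit.expression)  # industry:High-end manufacturing
print(unit.pending)     # True
```

Keeping the unit immutable and carrying the evidence index alongside the expression means a later stage can trace a tag back to the change events that justified it without consulting any other state.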

[0077] S3. Perform counter-evidence resolution processing on the tag unit to be verified: retrieve the revocation changes, substitution changes, corrective changes, or source overwrite changes that occur in the target data object during the subsequent monitoring phase, and determine whether they cause reverse destruction of the stable semantic kernel corresponding to the tag unit to be verified. If no reverse destruction is caused, the tag unit to be verified is solidified into the final tag result and written into the tag area of the target data object; if reverse destruction is caused, the solidification output of the tag unit to be verified is suppressed, and the change convergence fingerprint is updated.

[0078] S31. After the tag unit to be verified is generated, monitoring does not end immediately; the target data object continues to be monitored to observe whether new change events occur in the subsequent period. These subsequent change events are used to determine whether the previously identified stable semantic kernel still holds, and they provide the data source for the subsequent counter-evidence resolution. Specifically, once a stable semantic kernel has been identified and the corresponding tag unit to be verified has been generated, changes to the fields of the target data object continue to be monitored. If any field changes again during the subsequent monitoring phase, the new change operation is recorded and collected into a set of subsequent change events, which records all field changes that occurred after the tag unit to be verified was generated. Each change event records the field identifier, the operation type, the field value before the change, the field value after the change, and the time of the change. Collecting these events reveals how the semantics of the target data object evolve after the tag candidate is generated: if the changes do not destroy the original stable semantic kernel, the tag can be finally solidified; if they conflict with the stable semantic kernel, generation of the tag is rejected. S31 therefore establishes an observation period after the tag candidate is generated, so that the final tag is sufficiently stable and reliable.

[0079] Specifically, after the tag unit to be verified is generated, continuous change monitoring is performed on the target data object, and the subsequent change events occurring during the subsequent monitoring phase are retrieved; subsequent change events include revocation changes, substitution changes, corrective changes, or source overwrite changes. The set of change events detected during the subsequent monitoring phase is: $\Delta = \{\delta_1, \delta_2, \dots, \delta_m\}$;

[0080] Each subsequent change event is represented as: $\delta_j = (f_j, o_j, v_j^{\mathrm{pre}}, v_j^{\mathrm{post}}, \tau_j)$;

[0081] where $\Delta$ represents the set of subsequent change events; $\delta_j$ represents the $j$-th subsequent change event; $f_j$ is the identifier of the field that changed; $o_j$ is the operation type, which is a revocation change, substitution change, corrective change, or source overwrite change; $v_j^{\mathrm{pre}}$ is the field value before the change; $v_j^{\mathrm{post}}$ is the field value after the change; $\tau_j$ is the timestamp of the change event; and $m$ is the number of change events detected during the subsequent monitoring phase. By continuously monitoring the changing state of the target data object, a new set of change events is obtained for the subsequent counter-evidence analysis.

[0082] A revocation change reverses or restores a previous data change, returning the field value to its original state. A substitution change directly replaces the current value of a field with a new value, thereby altering its semantics. A corrective change corrects previously erroneous data. A source overwrite change means that the source of the field data has changed, with the new data source overwriting the data provided by the original source.

[0083] S32. Compare and analyze the subsequent change events against the stable semantic kernel corresponding to the tag unit to be verified, and determine whether the subsequent change events cause reverse destruction of the stable semantic kernel. The stable semantic kernel corresponding to the tag unit to be verified is: $k = (f_k, v_k)$; where $f_k$ is the identifier of the key field that constitutes the stable semantic kernel, and $v_k$ is the stable semantic value corresponding to that key field.

[0084] For any subsequent change event $\delta_j = (f_j, o_j, v_j^{\mathrm{pre}}, v_j^{\mathrm{post}}, \tau_j)$, with changed field $f_j$, operation type $o_j$, and field value $v_j^{\mathrm{post}}$ after the change, reverse destruction is determined to have occurred if either of the following conditions is met:

[0085] Condition 1: Direct conflict on a key field: $f_j = f_k$ and $v_j^{\mathrm{post}} \neq v_k$; where $f_j = f_k$ indicates that the subsequent change event directly affects the key field on which the stable semantic kernel depends, and $v_j^{\mathrm{post}}$ is the changed field value. If the changed field value is inconsistent with the stable semantic kernel value, a semantic conflict is considered to have occurred.

[0086] Condition 2: Underlying data source failure: Let the identifier of the data source on which the stable semantic kernel depends be $s_k$. If a subsequent change event is a source overwrite change or a revocation change and causes that data source to become invalid, the following condition is met: $s_k \notin S_{\mathrm{valid}}$; where $s_k$ is the data source identifier on which the stable semantic kernel depends, and $S_{\mathrm{valid}}$ is the set of data sources that are currently still valid. A stable semantic kernel usually relies on a specific data source, such as data synchronized from an official system, an external data interface, or manually entered information. If subsequent events revoke or overwrite that source (for example, the original data source is deleted, its data is overwritten by a new source, or the original source data is withdrawn), then even if the field value remains unchanged for the time being, the evidentiary basis on which the stable semantic kernel relies no longer exists. This case is therefore also treated as reverse destruction.
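The two conditions can be combined into a single decision function, sketched below. The type names `Op`, `ChangeEvent`, and `Kernel`, the function name `is_reverse_destruction`, and the sample source identifiers are hypothetical; the sketch also assumes that the caller maintains the set of currently valid data sources and passes it in after each event has been applied.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional, Set


class Op(Enum):
    REVOCATION = "revocation"
    SUBSTITUTION = "substitution"
    CORRECTION = "correction"
    SOURCE_OVERWRITE = "source_overwrite"


@dataclass(frozen=True)
class ChangeEvent:
    field_id: str
    op: Op
    old_value: Optional[str]  # field value before the change
    new_value: Optional[str]  # field value after the change
    timestamp: datetime


@dataclass(frozen=True)
class Kernel:
    field_id: str   # key field of the stable semantic kernel
    value: str      # stable semantic value
    source_id: str  # data source the kernel depends on


def is_reverse_destruction(event: ChangeEvent, kernel: Kernel,
                           valid_sources: Set[str]) -> bool:
    # Condition 1: the event touches the key field and leaves a different value.
    direct_conflict = (event.field_id == kernel.field_id
                       and event.new_value != kernel.value)
    # Condition 2: a source overwrite or revocation has left the kernel's
    # data source outside the set of still-valid sources.
    source_invalidated = (event.op in (Op.SOURCE_OVERWRITE, Op.REVOCATION)
                          and kernel.source_id not in valid_sources)
    return direct_conflict or source_invalidated


kernel = Kernel("industry_category", "High-end manufacturing", "official_sync")
now = datetime(2026, 3, 25, 9, 0)
substitution = ChangeEvent("industry_category", Op.SUBSTITUTION,
                           "High-end manufacturing", "Retail", now)
unrelated = ChangeEvent("contact_phone", Op.CORRECTION, "123", "456", now)
source_gone = ChangeEvent("contact_phone", Op.SOURCE_OVERWRITE, "456", "456", now)

print(is_reverse_destruction(substitution, kernel, {"official_sync"}))  # True
print(is_reverse_destruction(unrelated, kernel, {"official_sync"}))     # False
print(is_reverse_destruction(source_gone, kernel, {"new_feed"}))        # True
```

The third call illustrates the point made in Condition 2: the key field is untouched, yet the kernel is still considered destroyed because its source is no longer valid.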

[0087] The core purpose of step S32 is, once the subsequent change information has been obtained, to compare it against the stable semantic kernel on which the previously generated tag candidate was based, and to determine whether the new changes violate the conditions under which the kernel was considered stable. If the new changes do not affect the stable semantic kernel, the kernel can still be considered stable and reliable. If the new changes directly or indirectly undermine the foundation of the semantic kernel, the previously generated tag candidate should not be retained. In essence, this is a reverse evidence detection mechanism that identifies whether any new change could overturn the stable semantic kernel.

[0088] S33. If it is determined that no reverse destruction has occurred, the tag unit to be verified has passed the counter-evidence test; it is solidified as the final tag result, and the final tag result is written into the tag area of the target data object. Let a tag unit to be verified be $U_i$. When all subsequent change events satisfy $\forall \delta_j \in \Delta:\ \neg D(\delta_j, k)$, that is, no reverse destruction has occurred, the tag solidification operation is performed: $R = U_i$; where $\forall$ means "for all"; $\neg$ indicates negation, that is, the statement does not hold; $k$ represents the stable semantic kernel; $\delta_j$ represents a change event in the set $\Delta$ of subsequent change events; $D(\delta_j, k)$ is the decision function for whether the change event $\delta_j$ causes reverse destruction of the stable semantic kernel $k$; and $R$ represents the final tag result, which is written to the tag area of the target data object, completing tag solidification. The formula states that no subsequent change event destroys the stable semantic kernel.
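The solidification rule amounts to checking that no event in the subsequent change set triggers the decision function. A minimal sketch follows, assuming the decision function is supplied as a predicate; the name `resolve` and the use of `None` for a suppressed tag are illustrative choices, not taken from the source.

```python
from typing import Callable, Iterable, Optional, TypeVar

E = TypeVar("E")


def resolve(expression: str,
            events: Iterable[E],
            destroys: Callable[[E], bool]) -> Optional[str]:
    """Return the final tag if no event destroys the kernel, else None.

    "For all events, not destroys(event)" is the same statement as
    "there is no event for which destroys(event) holds".
    """
    if any(destroys(event) for event in events):
        return None        # suppressed: the tag is not solidified
    return expression      # solidified final tag result


print(resolve("industry:High-end manufacturing", ["e1", "e2"], lambda e: False))
print(resolve("industry:High-end manufacturing", ["e1", "e2"], lambda e: e == "e2"))
```

Note that an empty event set also yields the tag: if nothing changed during the observation period, there is no counter-evidence.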

[0089] S34. If reverse destruction is determined, a suppression mechanism is triggered to prevent the tag unit to be verified from being solidified into the final tag result, and the change convergence fingerprint of the target data object is updated based on the change information carried by the subsequent change events, so that those events are included in the convergence analysis of the new monitoring interval. If there exists a subsequent change event satisfying $\exists \delta_j \in \Delta:\ D(\delta_j, k) = 1$, the tag suppression mechanism is triggered, $R = \varnothing$, and the convergence fingerprint is updated: $F' = G(F, \Delta)$; where $D$ represents the reverse destruction decision function applied to a change event $\delta_j$ and the stable semantic kernel $k$, a value of 1 indicating that reverse destruction has occurred; $R$ is the final tag result, which is empty if reverse destruction occurs; $F$ is the original change convergence fingerprint; $\Delta$ represents the set of subsequent change events; $F'$ is the updated change convergence fingerprint; $\exists$ means "there exists at least one"; and $G$ is the change convergence fingerprint update function, which incorporates subsequent change events into the new semantic convergence analysis structure. Its role is to reintegrate the new change events from the subsequent monitoring phase into the original change convergence structure and to re-divide the change retention segment, change cancellation segment, and change suspension segment. The specific execution process is as follows:

[0090] The first step is to read the three sets in the original change convergence fingerprint: the set of change retention segments, the set of change cancellation segments, and the set of change suspension segments.

[0091] The second step is to match each change event in the subsequent change event set with the original change structure according to the field identifier, and update the change trajectory of the corresponding field.

[0092] The third step is to re-evaluate the semantic change status of the field based on the new change trajectory. For example, if a field originally belonged to the change retention segment but a new change event replaces or revokes it, the originally retained semantics of the field are moved to the change cancellation segment.

[0093] The fourth step is to re-determine whether the new semantic state resulting from the new changes belongs to the change retention segment, the change cancellation segment, or the change suspension segment based on the change trajectory.

[0094] The fifth step is to recombine the updated three types of change segments to form a new change convergence fingerprint, and use this fingerprint as input data for the semantic convergence analysis of the next listening interval.
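The five steps above can be sketched as a pure function over a simple fingerprint structure. This is a sketch under explicit assumptions: `Fingerprint`, `Event`, and `update_fingerprint` are hypothetical names, each segment is modeled as a per-field mapping, and step four is simplified so that any freshly written value is placed in the change suspension segment until a later monitoring interval confirms it. The source text does not specify the classification rule for step four, so this rule is a stand-in.

```python
from dataclasses import dataclass, field
from typing import Dict, List, NamedTuple, Optional


class Event(NamedTuple):
    field_id: str
    op: str                   # e.g. "revocation", "substitution"
    new_value: Optional[str]  # None when the change leaves no value
    timestamp: float


@dataclass
class Fingerprint:
    retained: Dict[str, str] = field(default_factory=dict)
    cancelled: Dict[str, List[str]] = field(default_factory=dict)
    suspended: Dict[str, str] = field(default_factory=dict)


def update_fingerprint(fp: Fingerprint, events: List[Event]) -> Fingerprint:
    # Step 1: read (copy) the three segment sets of the original fingerprint.
    new_fp = Fingerprint(
        retained=dict(fp.retained),
        cancelled={f: list(vs) for f, vs in fp.cancelled.items()},
        suspended=dict(fp.suspended),
    )
    # Step 2: match each event to its field, in time order.
    for ev in sorted(events, key=lambda e: e.timestamp):
        # Step 3: a value displaced by the event moves to the cancelled segment.
        for segment in (new_fp.retained, new_fp.suspended):
            old = segment.pop(ev.field_id, None)
            if old is not None and old != ev.new_value:
                new_fp.cancelled.setdefault(ev.field_id, []).append(old)
        # Step 4 (simplified assumption): a new value is not yet converged.
        if ev.new_value is not None:
            new_fp.suspended[ev.field_id] = ev.new_value
    # Step 5: the recombined segments form the new fingerprint.
    return new_fp


fp = Fingerprint(retained={"industry_category": "High-end manufacturing"})
events = [Event("industry_category", "substitution", "Retail", 100.0)]
updated = update_fingerprint(fp, events)
print(updated.retained)   # {}
print(updated.cancelled)  # {'industry_category': ['High-end manufacturing']}
print(updated.suspended)  # {'industry_category': 'Retail'}
```

Returning a new fingerprint instead of mutating the input keeps the original available as evidence, which fits the traceability purpose of the solidified evidence index.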

[0095] Through the above processing, the reverse changes that occur in the subsequent monitoring stage can prevent the tags from being fixed prematurely, and the new change information can be reintegrated into the change convergence analysis process, thereby ensuring the stability and reliability of the final generated tag results.

[0096] This invention encompasses any substitutions, modifications, equivalent methods, and solutions made within the spirit and scope of this invention. To provide the public with a thorough understanding of this invention, specific details are described in detail in the following preferred embodiments; however, those skilled in the art will fully understand the invention even without these details. Furthermore, to avoid unnecessary misunderstanding of the essence of this invention, well-known methods, processes, procedures, components, and circuits are not described in detail.

[0097] The above description is only a preferred embodiment of the present invention. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. An automatic tag generation method based on data change monitoring, characterized in that it includes the following steps: S1, perform continuous change monitoring on the target data object; when field rewriting, field supplementation, field replacement, field withdrawal, or field association migration is detected, instead of directly generating a tag based on a single change result, extract the change retention segment, change cancellation segment, and change suspension segment from the continuous change process of the target data object within the preset monitoring interval, and construct the corresponding change convergence fingerprint; S2, perform tag solidification determination on the change convergence fingerprint, segment the change retention segment, change cancellation segment, and change suspension segment in the change convergence fingerprint, identify the stable semantic kernel that meets the solidification conditions, and generate the corresponding tag unit to be verified based on the stable semantic kernel; S3, perform counter-evidence resolution processing on the tag unit to be verified, retrieve the revocation changes, substitution changes, corrective changes, or source overwrite changes that occur in the target data object during the subsequent monitoring phase, and determine whether they cause reverse destruction of the stable semantic kernel corresponding to the tag unit to be verified; if no reverse destruction occurs, the tag unit to be verified is solidified into the final tag result and written into the tag area of the target data object; if reverse destruction occurs, the solidified output of the tag unit to be verified is suppressed, and the change convergence fingerprint is updated.

2. The automatic tag generation method based on data change monitoring according to claim 1, characterized in that, within the preset monitoring interval, an operation event is recorded each time a field of the target data object is rewritten, supplemented, replaced, withdrawn, or migrated, forming a sequence of change events ordered by time; each operation event includes at least the affected field identifier, the operation type, and a snapshot of the field values before and after the operation.

3. The automatic tag generation method based on data change monitoring according to claim 1, characterized in that the sequence of change events is traversed and the value trajectory of each field is tracked independently: if the final state of a field at the end of the sequence shows a net change compared with the beginning of the sequence, and the net change is not subsequently undone or overwritten, the final semantics of the field are marked as a change retention segment; if a change to a field that occurs in the middle of the sequence is subsequently completely cancelled, the cancelled change semantics are marked as a change cancellation segment; if a field is still in an unconfirmed state at the end of the sequence, the current intermediate semantics of the field are marked as a change suspension segment.

4. The automatic tag generation method based on data change monitoring according to claim 1, characterized in that all marked change retention segments, change cancellation segments, and change suspension segments are aggregated according to the semantic relationships between fields to generate a change convergence fingerprint that represents the overall convergence state of the semantic evolution of the target data object within the preset monitoring interval; the change convergence fingerprint represents the semantic changes that have been stably retained during the continuous change process of the target data object, the semantic changes that have been cancelled by subsequent changes, and the semantic changes that have not yet converged.

5. The automatic tag generation method based on data change monitoring according to claim 1, characterized in that S2 further includes parsing the change convergence fingerprint and extracting the semantics of all fields marked as change retention segments, the historical change trajectories marked as change cancellation segments, and the status of undetermined fields marked as change suspension segments.

6. The automatic tag generation method based on data change monitoring according to claim 5, characterized in that the semantics of each field in the change retention segment are aggregated and analyzed to identify the stable semantic kernel that meets the preset solidification conditions; the preset solidification conditions include: the current field semantics persist in the change retention segment for a duration exceeding a first threshold, the confidence level of the underlying data source on which the current field semantics depend is higher than a second threshold, and there is no fundamental conflict between the current field semantics and the historical semantics in the change cancellation segment; if a stable semantic kernel that satisfies the preset solidification conditions is identified, the semantic pointer corresponding to the stable semantic kernel is instantiated into a candidate tag expression, and an identifier indicating that the candidate tag expression is currently in a pending-verification state is assigned to it, thereby generating a tag unit to be verified.

7. The automatic tag generation method based on data change monitoring according to claim 6, characterized in that the tag unit to be verified includes the semantic content of the stable semantic kernel, the solidified evidence index of the semantic content in the change convergence fingerprint, and a marker to be resolved; the marker to be resolved indicates that the tag unit to be verified must pass the subsequent counter-evidence resolution processing before final solidification, so that possible reverse destruction evidence is excluded.

8. The automatic tag generation method based on data change monitoring according to claim 1, characterized in that S3 includes, after generating the tag unit to be verified, continuing to perform continuous change monitoring on the target data object and retrieving the subsequent change events that occur in the subsequent monitoring phase; the subsequent change events include revocation changes, substitution changes, corrective changes, or source overwrite changes.

9. The automatic tag generation method based on data change monitoring according to claim 8, characterized in that the counter-evidence resolution includes comparing and analyzing the subsequent change event against the stable semantic kernel corresponding to the tag unit to be verified, and determining whether the subsequent change event causes reverse destruction of the stable semantic kernel; the conditions for determining reverse destruction include: the subsequent change event directly affects the key field constituting the stable semantic kernel, and the changed field value fundamentally conflicts with the semantic orientation of the stable semantic kernel; or, although the subsequent change event does not directly affect the key field, it causes the supporting basis of the stable semantic kernel to disappear by revoking or overwriting the underlying data source on which the stable semantic kernel depends.

10. The automatic tag generation method based on data change monitoring according to claim 9, characterized in that, if it is determined that no reverse destruction has occurred, the tag unit to be verified has passed the counter-evidence test, the tag unit to be verified is solidified as the final tag result, and the final tag result is written into the tag area of the target data object; if reverse destruction is determined, a suppression mechanism is triggered to prevent the tag unit to be verified from being solidified into the final tag result, and, based on the change information carried by the subsequent change event, the change convergence fingerprint of the target data object is updated so that the subsequent change event is included in the convergence analysis within the new monitoring interval.