Dataset quality evaluation method, device, and storage medium based on a hierarchical indicator system

A hierarchical indicator system combined with uniqueness constraints on evaluation records addresses the lack of solidified criteria and of consistency in dataset quality evaluation. Evaluation results become stable and traceable, maintenance costs are reduced, and evaluation efficiency is improved.

CN122220179A · Pending · Publication Date: 2026-06-16 · BEIJING ZHONGLIANG INTELLIGENT NUMBER TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING ZHONGLIANG INTELLIGENT NUMBER TECHNOLOGY CO LTD
Filing Date
2026-03-04
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing dataset quality assessment technologies have shortcomings in terms of standardization, consistency assurance, and evidence traceability, making it difficult to achieve rapid reuse, stability, and traceability.

Method used

A hierarchical indicator system is adopted, the evaluation criteria are solidified through the evaluation scheme, uniqueness constraints are set, structured evaluation records are generated, and they are stored in the evidence chain according to the indicator hierarchy path to ensure the uniqueness and traceability of the evaluation results.

Benefits of technology

It enables the reuse of the same set of indicators across tasks, reduces the cost of redundant development, ensures the stability and comparability of evaluation results, and improves the traceability and verifiability of the evaluation process.



Abstract

This application discloses a dataset quality evaluation method, device, and storage medium based on a hierarchical indicator system, belonging to the technical field of data management. The method includes: obtaining the evaluation scheme and hierarchical indicator system corresponding to the task to be evaluated, wherein the evaluation scheme is used to solidify the evaluation criteria; generating an executable evaluation plan based on the evaluation scheme; executing the evaluation plan and invoking evaluation operators to generate structured evaluation records, wherein the structured evaluation records follow a uniqueness constraint and include a unique problem identifier and an evidence pointer; storing the structured evaluation records in the evidence chain according to the indicator hierarchy path; deduplicating and counting the structured evaluation records based on the unique problem identifiers in the evidence chain to generate label statistics results; and calculating the hierarchical score of each indicator level and the total score from the label statistics results, and outputting a rejection (veto) conclusion and the corresponding evidence pointer when a rejection judgment rule is satisfied. The method achieves reliable, traceable, and reusable data quality evaluation.

Description

Technical Field

[0001] This invention belongs to the field of data governance technology, and in particular relates to a method, device, and storage medium for evaluating the quality of datasets based on a hierarchical indicator system.

Background Technology

[0002] In engineering practice, dataset quality assessment typically relies on manual sampling, rule scripts, or model inference to output scores and conclusions. However, with the expansion of data governance scale and the increasing complexity of assessment scenarios, existing technical solutions have exposed a number of structural defects.

[0003] First, existing methods often directly encode the evaluation content within operators or scripts, resulting in a hard-coded evaluation implementation. This coupled architecture makes it difficult to quickly reuse the same indicator system across different datasets or projects. Each new task requires the development of a new adaptation script, resulting in a significant amount of repetitive work. Furthermore, when the indicator system needs version iteration or caliber adjustment, the code must be modified layer by layer and re-verified, leading to high maintenance costs and a high risk of version inconsistencies.

[0004] Secondly, to meet the evaluation needs of large-scale datasets, modern data engineering generally introduces workflow scheduling, sharded parallelism, and failure retry mechanisms. However, in existing technologies, the same operator may generate multiple duplicate records in retry or concurrent scenarios, the same problem may be counted multiple times, and the score of the same indicator may fluctuate under different operating conditions, making the evaluation conclusions unstable and damaging the credibility of the evaluation results.

[0005] Furthermore, in traditional solutions, evaluation operators typically output the final score or judgment directly, rather than the original evaluation record. This design makes it impossible to pinpoint which specific sample triggered the problem label, trace the original evidence for a deduction, or independently evaluate and optimize the operator's effectiveness afterward. When evaluation results cause disputes or require audit review, there is a lack of identifiable and replayable evidence to support them.

[0006] In summary, existing dataset quality assessment technologies have shortcomings in terms of standardization, consistency assurance, and evidence traceability, and urgently need to be improved.

Summary of the Invention

[0007] The purpose of this invention is to overcome the shortcomings of the prior art and provide a dataset quality evaluation method, device, and storage medium based on a hierarchical indicator system. The hierarchical indicator system and the evaluation scheme solidify the evaluation criteria; uniqueness constraints on the evaluation result records keep those criteria consistent under concurrent and retry conditions; and the evidence chain serves as the input source for calculation and adjudication, so that results can be located, replayed, and verified.

[0008] To achieve the above objectives, the present invention is implemented using the following technical solution:

[0009] In a first aspect, the present invention provides a dataset quality evaluation method based on a hierarchical indicator system, comprising:
obtaining the evaluation scheme and hierarchical indicator system corresponding to the task to be evaluated, wherein the hierarchical indicator system includes multi-level indicators, and the evaluation scheme is used to configure indicator weights and rejection judgment rules and to determine the activation relationship between the hierarchical indicator system and the evaluation operators;
generating an executable evaluation plan based on the evaluation scheme;
executing the evaluation plan and invoking the evaluation operators to generate structured evaluation records, wherein each structured evaluation record includes a unique problem identifier and an evidence pointer, and the structured evaluation records follow a uniqueness constraint: for the same task to be evaluated, exactly one record exists for each combination of task and indicator, and exactly one record exists for each combination of task, indicator, and operator;
storing the structured evaluation records in the evidence chain according to the indicator hierarchy path;
deduplicating and counting the structured evaluation records based on the unique problem identifiers in the evidence chain to generate label statistics results; and
calculating the hierarchical score of each indicator level and the total score based on the label statistics results, and outputting the rejection conclusion and the corresponding evidence pointer when the rejection judgment rules are met, wherein the total score is obtained based on the indicator weights.

[0010] Optionally, the hierarchical indicator system includes at least first-level, second-level, and third-level indicators, and the indicators at each level are executable evaluation targets. Generating an executable evaluation plan based on the evaluation scheme includes:
constructing, based on the evaluation scheme, an indicator tree that contains the hierarchical relationships between indicators; and
recursively compiling the indicator tree to generate an evaluation plan containing multi-level nodes. The multi-level nodes include multi-level indicator execution nodes, upper-level indicator aggregation nodes, and result aggregation nodes. The multi-level indicator execution nodes include first-level, second-level, and third-level indicator execution nodes. The evaluation plan constrains, through dependencies, the execution order of the indicator execution nodes at each level, the upper-level indicator aggregation nodes, and the result aggregation nodes.

[0011] Optionally, uniqueness constraints are set on the structured evaluation records before the evaluation plan is executed, which includes:
pre-creating indicator result records and evaluation operator result records for the task to be evaluated;
setting a unique constraint on the combination of task and indicator, so that the same task to be evaluated has only one indicator result record for the same indicator;
setting a unique constraint on the combination of task, indicator, and operator, so that the same task to be evaluated has only one operator result record for the same combination of indicator and evaluation operator; and
when concurrent writes or failed retries occur, merging into and updating the existing indicator result records or evaluation operator result records, so that changes in the number of retries do not affect the statistical criteria.

[0012] Optionally, executing the evaluation plan and invoking the evaluation operators to generate structured evaluation records includes:
according to the evaluation plan, calling the evaluation operators bound to the indicators at each level through the multi-level indicator execution nodes to evaluate each sample unit, wherein a sample unit is a basic unit that can be independently evaluated within the evaluation object of the task to be evaluated; and
outputting, by the evaluation operator, a structured evaluation record that includes at least: sample location information, indicator identifier, operator identifier, problem label, evidence pointer, execution status, and time consumption information. The sample location information and the problem label together constitute the unique problem identifier. The evidence pointer establishes a verifiable association between the hit result corresponding to the problem label and the position within the sample unit indicated by the sample location information.

[0013] Optionally, storing the structured evaluation records in the evidence chain according to the indicator hierarchy path includes:
writing the structured evaluation records into the evidence chain in a columnar structured format; and
organizing the structured evaluation records stored in the evidence chain according to a preset hierarchical path. The indicator levels in the hierarchical path correspond to the indicator levels in the indicator tree, and the nodes of the hierarchical path are, in sequence: task, first-level indicator, second-level indicator, third-level indicator, and operator.

[0014] Optionally, deduplicating and counting the structured evaluation records based on the content of the evidence chain to generate label statistics results includes:
extracting sample location information and problem labels from the current evidence chain; and
using the combination of sample location information and problem label as the deduplication key to deduplicate and count the structured evaluation records, so that the same problem in the same sample unit is counted only once, and generating the label statistics results.

[0015] Optionally, calculating the hierarchical scores of each indicator level and the total score based on the label statistics results includes:
obtaining, by the third-level indicator execution node, the theoretical maximum number of identifiable check items and the actual number of correct check items based on the label statistics results, and obtaining the third-level indicator score from these two quantities, wherein the theoretical maximum number of identifiable check items is calculated from the number of sampled sample units and the number of rejection check items, and the number of sampled sample units is obtained from the label statistics results;
performing, by the second-level indicator execution node, weighted aggregation of the third-level indicator scores according to the indicator weights to obtain the second-level indicator hierarchical score;
performing, by the first-level indicator execution node, weighted aggregation of the second-level indicator hierarchical scores according to the indicator weights to obtain the first-level indicator hierarchical score; and
integrating the scores of the first-level indicators through the upper-level indicator aggregation node, and calculating the total score through the result aggregation node based on the integration results.

[0016] Optionally, the structured evaluation record further includes an evidence summary, which is used in place of the evidence pointer when the evidence pointer cannot be saved or is invalid. The evidence summary includes at least sample location information, hit fields or paths, offset information of hit segments, and feature identifiers for consistency verification.

[0017] In a second aspect, the present invention provides a dataset quality evaluation device based on a hierarchical indicator system, comprising:
Evaluation scheme acquisition module: used to acquire the evaluation scheme and hierarchical indicator system corresponding to the task to be evaluated, wherein the hierarchical indicator system includes multi-level indicators, and the evaluation scheme is used to configure indicator weights and rejection judgment rules and to determine the activation relationship between the hierarchical indicator system and the evaluation operators;
Evaluation plan acquisition module: used to generate an executable evaluation plan based on the evaluation scheme;
Evaluation record acquisition module: used to execute the evaluation plan and call the evaluation operators to generate structured evaluation records, wherein the structured evaluation records follow the uniqueness constraint: for the same task to be evaluated, exactly one record exists for each combination of task and indicator, and exactly one record exists for each combination of task, indicator, and operator;
Evidence chain update module: used to store the structured evaluation records in the evidence chain according to the indicator hierarchy path;
Label statistics result acquisition module: used to deduplicate and count the structured evaluation records based on the content of the evidence chain, and generate label statistics results;
Evaluation score acquisition module: used to calculate the hierarchical score of each indicator level and the total score based on the label statistics results, and output the rejection conclusion and its associated evidence index when the rejection judgment rule is met, wherein the total score is obtained based on the indicator weights.

[0018] In a third aspect, the present invention provides a computer storage medium having a computer program stored thereon which, when executed by a processor, implements the dataset quality evaluation method based on a hierarchical indicator system as described in any implementation of the first aspect.

[0019] Compared with existing technologies, the beneficial effects achieved by this invention are as follows. By solidifying evaluation settings such as indicator weights, rejection judgment rules, and evaluation operator activation relationships in the evaluation scheme, the evaluation criteria are decoupled from the execution logic. The same indicator system can therefore be reused across tasks, the risk of version inconsistency is avoided, and the cost of repeated development is reduced. Furthermore, by setting unique constraints on the task and indicator dimensions and on the task, indicator, and operator dimensions of the structured evaluation records, and by performing merge updates during concurrent writes or failure retries, the scoring fluctuations caused by the same issue being counted repeatedly are avoided, which ensures the stability and comparability of the evaluation conclusions. Finally, by storing the structured evaluation records in the evidence chain according to the indicator hierarchy path, an association is established between the unique problem identifier and the evidence pointer. This makes the evaluation process traceable, replayable, and verifiable, improving the traceability of the evaluation conclusions and the governance efficiency of dataset quality evaluation.

Attached Figure Description

[0020] Figure 1 is a flowchart of a dataset quality evaluation method based on a hierarchical indicator system in one embodiment of the present invention.

Detailed Implementation

[0021] The present invention will be further described below with reference to the accompanying drawings. The following embodiments are only used to more clearly illustrate the technical solution of the present invention, and should not be used to limit the scope of protection of the present invention.

[0022] Embodiment 1

[0023] As shown in Figure 1, this embodiment provides a dataset quality evaluation method based on a hierarchical indicator system. The method achieves stable evaluation of dataset quality under distributed and retry conditions by solidifying the evaluation criteria through the evaluation scheme, driving execution through the evaluation plan, ensuring consistency through uniqueness constraints, and supporting replay verification through the evidence chain.

[0024] S1: Evaluation Objects and Task Input

[0025] The system organizes the evaluation objects of the task to be evaluated into a set of sample units. Each sample unit corresponds to a basic unit in the dataset that can be independently evaluated, and carries sample location information that can be used to locate and replay it. After the evaluation task is created, it is associated with the evaluation scheme and the dataset version information, and a task identifier is generated for subsequent evidence chain isolation and result traceability.

[0026] S2: Obtain the evaluation scheme and hierarchical indicator system corresponding to the task to be evaluated.

[0027] The system obtains the evaluation scheme and hierarchical indicator system corresponding to the task to be evaluated. The hierarchical indicator system includes multiple levels of indicators, including at least first-level, second-level, and third-level indicators, and the indicators at each level are executable evaluation targets. The evaluation scheme is used to configure indicator weights, scoring parameters, sampling strategies, and rejection rules, and to determine the activation relationship between the hierarchical indicator system and the evaluation operators. This allows the evaluation criteria to be reused across tasks and supports version freezing.

[0028] S3: Generate an executable evaluation plan based on the evaluation scheme.

[0029] The system constructs an indicator tree containing hierarchical relationships based on the evaluation scheme, and recursively compiles the indicator tree to generate an evaluation plan containing multi-level nodes. The multi-level nodes include multi-level indicator execution nodes, upper-level indicator aggregation nodes, and result aggregation nodes; the multi-level indicator execution nodes include first-level indicator execution nodes, second-level indicator execution nodes, and third-level indicator execution nodes; the upper-level indicator aggregation nodes are responsible for aggregating lower-level results, and the result aggregation nodes form the total score and grade. After compilation, each level node carries indicator identifiers, indicator weights, and rejection switches and rejection threshold parameters for rejection determination, enabling scoring and rejection to be completed during the execution phase without re-analyzing the scheme criteria. The evaluation plan limits the execution order of each level node through directed acyclic dependencies: from the third-level indicator execution node to the second-level indicator execution node, from the second-level indicator execution node to the first-level indicator execution node, from the first-level indicator execution node to the upper-level indicator aggregation node, and from the upper-level indicator aggregation node to the result aggregation node.
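The compilation step can be illustrated with a minimal Python sketch. The `Indicator` class, the node names `upper_aggregate` and `result_aggregate`, and the example indicator identifiers are hypothetical and do not come from the patent. The sketch only shows how a recursive walk over an indicator tree can yield a directed acyclic dependency map whose topological order runs from third-level nodes up to the result aggregation node.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Indicator:
    """One node of the indicator tree (hypothetical structure)."""
    ident: str
    weight: float = 1.0
    children: list = field(default_factory=list)


def compile_plan(roots):
    """Recursively compile the tree into {node: set of nodes it depends on}."""
    deps = {"upper_aggregate": set(), "result_aggregate": {"upper_aggregate"}}

    def visit(node):
        # A parent indicator node may only run after all of its children.
        deps[node.ident] = {child.ident for child in node.children}
        for child in node.children:
            visit(child)

    for root in roots:
        visit(root)
        # First-level nodes feed the upper-level aggregation node.
        deps["upper_aggregate"].add(root.ident)
    return deps


tree = [
    Indicator("L1.accuracy", 1.0, [
        Indicator("L2.format", 0.4, [Indicator("L3.date_format"), Indicator("L3.encoding")]),
        Indicator("L2.values", 0.6, [Indicator("L3.range_check")]),
    ])
]
plan = compile_plan(tree)
order = list(TopologicalSorter(plan).static_order())
```

Any scheduler that respects `plan` would run third-level nodes first and the result aggregation node last; `static_order()` is used here only to make one valid order visible.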

[0030] S4: Uniqueness Constraints and Idempotency and Consistency Guarantees

[0031] Before executing the evaluation plan, the system pre-creates indicator result records and evaluation operator result records for the tasks to be evaluated, and sets uniqueness constraints on the structured evaluation records. In this embodiment, a unique constraint is set that includes combinations of tasks and indicators, ensuring that the same task to be evaluated corresponds to only one indicator result record for the same indicator; a unique constraint is also set that includes combinations of tasks, indicators, and operators, ensuring that the same task to be evaluated corresponds to only one operator result record for the same combination of indicator and evaluation operator. When concurrent writes or failed retries occur, existing indicator result records or evaluation operator result records are merged and updated to maintain consistency in statistical and scoring methods and avoid the same issue being counted repeatedly, which could lead to scoring fluctuations.
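One common way to obtain this merge-on-conflict behavior is a unique index combined with an upsert. The sketch below uses Python's built-in `sqlite3` module and assumes an SQLite version that supports `ON CONFLICT ... DO UPDATE` (3.24 or later). The table name, column names, and the `attempts` counter are illustrative assumptions, not details from the patent. Only the (task, indicator, operator) constraint is shown; a (task, indicator) table would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE operator_result (
        task_id      TEXT NOT NULL,
        indicator_id TEXT NOT NULL,
        operator_id  TEXT NOT NULL,
        status       TEXT NOT NULL DEFAULT 'pending',
        attempts     INTEGER NOT NULL DEFAULT 0,
        UNIQUE (task_id, indicator_id, operator_id)
    )
    """
)


def write_result(task_id, indicator_id, operator_id, status):
    """Insert the record, or merge into the existing one on a key conflict."""
    conn.execute(
        """
        INSERT INTO operator_result (task_id, indicator_id, operator_id, status, attempts)
        VALUES (?, ?, ?, ?, 1)
        ON CONFLICT (task_id, indicator_id, operator_id)
        DO UPDATE SET status = excluded.status,
                      attempts = operator_result.attempts + 1
        """,
        (task_id, indicator_id, operator_id, status),
    )


# Pre-create the record, then simulate a failed attempt followed by a retry.
conn.execute(
    "INSERT INTO operator_result (task_id, indicator_id, operator_id) VALUES (?, ?, ?)",
    ("task-1", "L3.date_format", "regex_date"),
)
write_result("task-1", "L3.date_format", "regex_date", "failed")
write_result("task-1", "L3.date_format", "regex_date", "success")
rows = conn.execute("SELECT status, attempts FROM operator_result").fetchall()
```

After the retry there is still a single row for the key, so downstream statistics see one operator result regardless of how many attempts were made.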

[0032] S5: Execute the evaluation plan and generate structured evaluation records.

[0033] The system executes the multi-level indicator execution nodes according to the evaluation plan, calls the evaluation operators bound to the indicators at each level to evaluate each sample unit, and outputs structured evaluation records through the evaluation operators. The evaluation operators do not directly output the final score, which decouples execution from calculation and facilitates operator expansion and standardization. In this embodiment, the structured evaluation record includes at least: sample location information, indicator identifier, operator identifier, issue label, evidence pointer, execution status, and time consumption information. The evidence pointer is used to establish a verifiable association between the hit result corresponding to the issue label and the sample unit location indicated by the sample location information. For a given issue label, it points to a sample content location that can be verified and located, thereby supporting subsequent replay verification.
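As a rough illustration of such a record, the following Python sketch defines a record type with the listed fields and a toy operator that emits a record instead of a score. The class name, the field names, the `empty_field_operator` function, and the pointer syntax `location#/field` are all hypothetical choices made for this example.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvaluationRecord:
    sample_location: str             # where the sample unit lives
    indicator_id: str
    operator_id: str
    issue_label: Optional[str]       # None when the check passes
    evidence_pointer: Optional[str]  # None when there is no hit
    status: str
    elapsed_ms: float

    @property
    def problem_id(self):
        """Unique problem identifier: (sample location, issue label)."""
        if self.issue_label is None:
            return None
        return (self.sample_location, self.issue_label)


def empty_field_operator(sample_location, sample, field_name, indicator_id):
    """Toy operator: flags a missing or blank field and returns a record, not a score."""
    start = time.perf_counter()
    value = sample.get(field_name)
    hit = value is None or str(value).strip() == ""
    elapsed_ms = (time.perf_counter() - start) * 1000
    return EvaluationRecord(
        sample_location=sample_location,
        indicator_id=indicator_id,
        operator_id="empty_field_operator",
        issue_label="EMPTY_FIELD" if hit else None,
        evidence_pointer=f"{sample_location}#/{field_name}" if hit else None,
        status="success",
        elapsed_ms=elapsed_ms,
    )


record = empty_field_operator("ds-v1:row-7", {"title": "  "}, "title", "L3.completeness")
```

Because the operator returns only facts about the sample, scoring rules can change later without rerunning or rewriting the operator.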

[0034] When the evidence pointer cannot be directly saved or cannot remain valid across storage media, the structured evaluation record also includes an evidence summary. The evidence summary includes at least sample location information, hit fields or paths, offset information of hit segments, and feature identifiers for consistency verification, so that a review can be completed without rerunning the operators. The sample location information includes at least a sample identifier for uniquely identifying the sample unit, and location elements for determining the position inside the sample unit. The sample identifier includes a dataset version identifier and a sample record identifier or line number. The location elements include at least one of field name, data path, and segment offset information. When the evaluation object is an image, audio, or video, the segment offset information includes at least one of bounding-box coordinates, timestamp, or frame sequence number.
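A minimal sketch of an evidence summary for text samples follows. It assumes, purely for illustration, that the feature identifier for consistency verification is a SHA-256 hash of the hit segment; the patent does not specify the fingerprint algorithm, and the class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceSummary:
    sample_location: str  # dataset version plus record identifier
    field_path: str       # hit field or path
    start: int            # offset of the hit segment (inclusive)
    end: int              # offset of the hit segment (exclusive)
    fingerprint: str      # feature identifier for consistency verification


def _fingerprint(segment):
    return hashlib.sha256(segment.encode("utf-8")).hexdigest()


def summarize(sample_location, field_path, text, start, end):
    """Capture enough information to re-check the hit without rerunning the operator."""
    return EvidenceSummary(sample_location, field_path, start, end, _fingerprint(text[start:end]))


def verify(summary, text):
    """Return True if the stored segment offsets still point at the same content."""
    return _fingerprint(text[summary.start:summary.end]) == summary.fingerprint


summary = summarize("ds-v1:row-7", "body", "call 555-0100 now", 5, 13)
```

During review, `verify` confirms that the sample content at the recorded offsets is unchanged, which is the kind of consistency check the summary is meant to support.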

[0035] S6: Store the structured evaluation records into the evidence chain according to the indicator hierarchy path.

[0036] The system uses a structured storage format to save structured evaluation records into the evidence chain, preferably a columnar storage format. The structured evaluation records stored in the evidence chain are organized according to a hierarchical path of task, first-level indicator, second-level indicator, third-level indicator, and operator. The indicator levels in the hierarchical path correspond to the indicator levels in the indicator tree, which meets the requirements of task isolation and indicator location at the same time. This allows evidence to be quickly located and replayed for verification at any indicator level, and also facilitates subsequent statistical analysis of the evidence.
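The hierarchical path can be sketched as a partition-style key. The `key=value` segment naming below is an assumption borrowed from common data-lake layouts, not something the patent prescribes, and the columnar file format itself is left out of the sketch.

```python
from pathlib import PurePosixPath


def evidence_path(task_id, level1, level2, level3, operator_id):
    """Build the storage path: task / level 1 / level 2 / level 3 / operator."""
    return PurePosixPath(
        f"task={task_id}",
        f"l1={level1}",
        f"l2={level2}",
        f"l3={level3}",
        f"operator={operator_id}",
    )


def under(path, prefix):
    """True if `path` lies under the given hierarchy prefix."""
    return path.parts[: len(prefix.parts)] == prefix.parts


p = evidence_path("t-001", "accuracy", "format", "date_format", "regex_date")
level2_prefix = PurePosixPath("task=t-001", "l1=accuracy", "l2=format")
```

Filtering by prefix, as in `under(p, level2_prefix)`, is what lets all evidence for one task or one indicator subtree be located without scanning unrelated records.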

[0037] S7: Perform deduplication and statistical analysis based on the evidence chain and generate label statistics results.

[0038] The system reads sample location information and issue labels from the current evidence chain, uses the combination of sample location information and issue labels as a deduplication key, and performs deduplication statistics on the structured evaluation records to ensure that the same issue in the same sample unit is counted only once, generating label statistics results. Deduplication statistics are only used for label statistics and scoring aggregation at the indicator level. The structured evaluation records of evaluation operators are still fully preserved in the evidence chain according to the dimensions of task, indicator, and operator, for replay verification and operator effect comparison.

[0039] The conflict convergence rules for deduplication statistics include: the same label corresponding to the same sample location information is counted only once; when the label statistics results are summarized, the number of issues is accumulated and the example evidence is merged, and the number of example evidence does not exceed a preset upper limit.
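The deduplication and convergence rules above can be sketched in a few lines of Python. The record layout (plain dictionaries with `sample_location`, `issue_label`, and `evidence_pointer` keys) and the default cap of three examples are assumptions made for this example.

```python
from collections import defaultdict


def label_statistics(records, max_examples=3):
    """Count each (sample location, issue label) pair once and cap example evidence."""
    seen = set()
    stats = defaultdict(lambda: {"count": 0, "examples": []})
    for rec in records:
        key = (rec["sample_location"], rec["issue_label"])  # deduplication key
        if key in seen:
            continue  # same issue in the same sample unit, e.g. produced by a retry
        seen.add(key)
        entry = stats[rec["issue_label"]]
        entry["count"] += 1
        if len(entry["examples"]) < max_examples:
            entry["examples"].append(rec["evidence_pointer"])
    return dict(stats)


records = [
    {"sample_location": "ds-v1:row-7", "issue_label": "EMPTY_FIELD", "evidence_pointer": "ds-v1:row-7#/title"},
    {"sample_location": "ds-v1:row-7", "issue_label": "EMPTY_FIELD", "evidence_pointer": "ds-v1:row-7#/title"},
    {"sample_location": "ds-v1:row-9", "issue_label": "EMPTY_FIELD", "evidence_pointer": "ds-v1:row-9#/title"},
    {"sample_location": "ds-v1:row-7", "issue_label": "BAD_DATE", "evidence_pointer": "ds-v1:row-7#/date"},
]
stats = label_statistics(records)
```

The input list is left untouched, which mirrors the statement that the full operator records remain in the evidence chain while only the statistics are deduplicated.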

[0040] S8: Calculate the stratified score and total score, and output the rejection conclusion.

[0041] The third-level indicator execution node receives sample unit data, executes the check items one by one according to the fixed set of identifiable check items in the evaluation scheme, and computes the theoretical maximum number of identifiable items, which is determined by the number of sampled sample units and the number of rejection check items. The number of sampled sample units is the number of distinct sample location information entries remaining after deduplication in the label statistics results. The node then identifies and records the number of problem hits in the label statistics results, that is, the number of non-compliant items after deduplication as determined by the mapping between check items and problem labels. The actual correct number is obtained by subtracting the number of problem hits from the theoretical maximum number of identifiable items. Finally, the node calculates the score corresponding to the ratio of the actual correct number to the theoretical maximum number of identifiable items, and outputs the third-level indicator score, the problem hit list, and the judgment details.
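A small worked example of the third-level score may help. It reads the actual correct number as the theoretical maximum minus the number of problem hits, and makes two assumptions that the text does not spell out: that the theoretical maximum is the product of the sampled unit count and the check item count, and that the ratio is mapped linearly onto a 100-point scale.

```python
def level3_score(sample_count, check_item_count, hit_count, scale=100.0):
    """Score = scale * (theoretical_max - hits) / theoretical_max."""
    # Assumption: every sampled unit is checked against every check item.
    theoretical_max = sample_count * check_item_count
    if theoretical_max <= 0:
        raise ValueError("no identifiable check items to score")
    if not 0 <= hit_count <= theoretical_max:
        raise ValueError("hit count must lie between 0 and the theoretical maximum")
    actual_correct = theoretical_max - hit_count
    return scale * actual_correct / theoretical_max


# 50 sampled units, 4 check items each, 10 deduplicated problem hits.
score = level3_score(sample_count=50, check_item_count=4, hit_count=10)
```

Here the theoretical maximum is 200, the actual correct number is 190, and the resulting score is 95.0 on the assumed 100-point scale.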

[0042] The second-level indicator execution node receives the output results of each third-level indicator execution node below, obtains the fixed indicator weight configuration in the evaluation scheme, performs weighted aggregation of the scores of each third-level indicator below according to the indicator weight, calculates the stratified score of the second-level indicator, and outputs the lower-level aggregation details and weight application records.

[0043] The first-level indicator execution node receives the output results of each second-level indicator execution node below. Similarly, it performs weighted aggregation on the scores of each second-level indicator according to the indicator weight configuration, calculates the scores of the first-level indicator, and outputs the lower-level aggregation details and weight application records as input to the upper-level indicator aggregation node.

[0044] The upper-level indicator aggregation node is responsible for aggregating the results of the lower level. It receives the output of the first-level indicator execution node, integrates the scores of each first-level indicator according to the preset aggregation rules, generates the mapping relationship between the hierarchical summary results and the scores of each level of indicator, and passes it to the result aggregation node, which then outputs the total score.
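The bottom-up weighted aggregation described in the preceding paragraphs can be sketched as a recursion over a nested structure. The dictionary layout and the normalization by the sum of sibling weights are assumptions for this example; the patent only states that scores are aggregated according to the configured indicator weights and preset aggregation rules.

```python
def aggregate(node):
    """Return the node's score: a leaf's own score, or the weighted mean of its children."""
    if "score" in node:
        return node["score"]  # third-level indicator score
    children = node["children"]
    total_weight = sum(child["weight"] for child in children)
    # Assumption: sibling weights are normalized so they need not sum to 1.
    return sum(child["weight"] * aggregate(child) for child in children) / total_weight


level1 = {
    "children": [
        {"weight": 0.4, "children": [
            {"weight": 0.5, "score": 95.0},
            {"weight": 0.5, "score": 80.0},
        ]},
        {"weight": 0.6, "children": [
            {"weight": 1.0, "score": 90.0},
        ]},
    ]
}
level1_score = aggregate(level1)
```

In this example the two second-level scores are 87.5 and 90.0, so the first-level score is 0.4 × 87.5 + 0.6 × 90.0 = 89.0. A total score over several first-level indicators could be computed by wrapping them in one more level of the same structure.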

[0045] When the aforementioned rejection judgment rule is met, a rejection conclusion and its associated evidence index are output. The associated evidence index points to the corresponding structured evaluation record in the evidence chain, which is used for review and rectification closure.
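The sketch below shows one possible shape for a rejection (veto) check. The rule form used here, a per-label switch plus a hit-count threshold evaluated against the label statistics, is an assumption; the patent mentions rejection switches and thresholds but does not define the exact rule semantics. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RejectionRule:
    indicator_id: str
    issue_label: str
    enabled: bool    # rejection switch
    threshold: int   # minimum deduplicated hit count that triggers rejection


def check_rejection(rules, stats):
    """Return one conclusion per triggered rule, each carrying its evidence index."""
    conclusions = []
    for rule in rules:
        entry = stats.get(rule.issue_label)
        if rule.enabled and entry is not None and entry["count"] >= rule.threshold:
            conclusions.append({
                "indicator_id": rule.indicator_id,
                "issue_label": rule.issue_label,
                "hit_count": entry["count"],
                "evidence": list(entry["examples"]),  # pointers into the evidence chain
            })
    return conclusions


label_stats = {"PII_LEAK": {"count": 2, "examples": ["ds-v1:row-3#/body"]}}
rules = [
    RejectionRule("L3.privacy", "PII_LEAK", enabled=True, threshold=1),
    RejectionRule("L3.format", "BAD_DATE", enabled=True, threshold=5),
]
conclusions = check_rejection(rules, label_stats)
```

Each conclusion carries the evidence pointers from the statistics, so a reviewer can go from the rejection straight to the underlying records.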

[0046] Embodiment 2

[0047] This embodiment provides a dataset quality evaluation device based on a hierarchical indicator system, including:
Evaluation scheme acquisition module: used to acquire the evaluation scheme and hierarchical indicator system corresponding to the task to be evaluated, wherein the hierarchical indicator system includes multi-level indicators, and the evaluation scheme is used to configure indicator weights and rejection judgment rules and to determine the activation relationship between the hierarchical indicator system and the evaluation operators;
Evaluation plan acquisition module: used to construct an indicator tree based on the evaluation scheme and generate an executable evaluation plan, wherein the evaluation plan organizes the indicator execution nodes at each level, the upper-level indicator aggregation nodes, and the result aggregation nodes through dependency relationships;
Evaluation record acquisition module: used to execute the evaluation plan and call the evaluation operators to generate structured evaluation records, wherein the structured evaluation records follow the uniqueness constraint: for the same task to be evaluated, exactly one record exists for each combination of task and indicator, and exactly one record exists for each combination of task, indicator, and operator; when concurrent writes or failure retries occur, the existing result records are merged and updated;
Evidence chain update module: used to store the structured evaluation records in the evidence chain in a columnar structured format according to the indicator hierarchy path, wherein the nodes of the hierarchy path are, in order: task, first-level indicator, second-level indicator, third-level indicator, and operator;
Label statistics result acquisition module: used to deduplicate and count the structured evaluation records based on the content of the evidence chain, using the combination of sample location information and problem label as the deduplication key, and generate label statistics results;
Evaluation score acquisition module: used to calculate the scores of the indicators at each level based on the theoretical maximum number of identifiable items and the number of problem hits in the label statistics results, aggregate the scores according to the indicator weights to obtain the hierarchical scores and the total score, and output the rejection conclusion and its associated evidence index when the rejection judgment rules are met.

[0048] The device provided in this embodiment can execute the dataset quality evaluation method based on the hierarchical indicator system provided in any step of Embodiment 1, and has the functional modules and beneficial effects corresponding to that method.

[0049] Example 3

[0050] This embodiment provides a computer storage medium storing a computer program. When the computer program is executed by a processor, it implements the dataset quality evaluation method based on a hierarchical indicator system as provided in any step of Embodiment 1.

[0051] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0052] This application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0053] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0054] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0055] The embodiments of the present invention have been described above with reference to the accompanying drawings. However, the present invention is not limited to the specific embodiments described above, which are merely illustrative and not restrictive. Under the guidance of the present invention, those skilled in the art can devise many other forms without departing from the spirit of the invention and the scope of the claims, and all of these forms fall within the protection scope of the present invention.

Claims

1. A dataset quality evaluation method based on a hierarchical indicator system, characterized in that the method comprises:
obtaining the evaluation scheme and the hierarchical indicator system corresponding to the task to be evaluated; wherein the hierarchical indicator system includes multi-level indicators, and the evaluation scheme is used to configure indicator weights and veto judgment rules and to determine the activation relationship between the hierarchical indicator system and the evaluation operators;
generating an executable evaluation plan based on the evaluation scheme;
executing the evaluation plan and invoking the evaluation operators to generate structured evaluation records; wherein the structured evaluation records include a problem unique identifier and an evidence pointer, and follow a uniqueness constraint: for the same evaluation task, exactly one evaluation record exists for each combination of task and indicator, and exactly one for each combination of task, indicator and operator;
storing the structured evaluation records in the evidence chain according to the indicator hierarchy path;
deduplicating and counting the structured evaluation records based on the problem unique identifier in the evidence chain to generate label statistics results;
calculating, based on the label statistics results, the hierarchical score of each indicator level and the total score, and outputting the veto conclusion and the corresponding evidence pointer when the veto judgment rules are met; wherein the total score is obtained based on the indicator weights.

2. The dataset quality evaluation method based on a hierarchical indicator system according to claim 1, characterized in that the hierarchical indicator system includes at least first-level indicators, second-level indicators, and third-level indicators, each level of indicator being an executable evaluation target indicator;
generating an executable evaluation plan based on the evaluation scheme includes:
constructing, based on the evaluation scheme, an indicator tree containing the hierarchical relationships between indicators;
recursively compiling the indicator tree to generate an evaluation plan containing multi-level nodes; wherein the multi-level nodes include multi-level indicator execution nodes, upper-level indicator aggregation nodes, and result aggregation nodes; the multi-level indicator execution nodes include first-level indicator execution nodes, second-level indicator execution nodes, and third-level indicator execution nodes; and the evaluation plan constrains, through dependencies, the execution order of the indicator execution nodes at each level, the upper-level indicator aggregation nodes, and the result aggregation nodes.
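
As a non-limiting illustration of the plan generation recited in claim 2, the following minimal Python sketch recursively compiles an indicator tree into a dependency map and derives one valid execution order. The `Indicator` class, the node-name prefixes (`exec:`, `aggregate:`) and the sample indicator names are assumptions made for illustration; they do not appear in the application.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Indicator:
    """Hypothetical indicator-tree node; `children` are lower-level indicators."""
    name: str
    children: list["Indicator"] = field(default_factory=list)


def compile_plan(roots: list[Indicator]) -> dict[str, set[str]]:
    """Recursively compile the tree into {node: set of nodes it depends on}."""
    deps: dict[str, set[str]] = {}

    def visit(ind: Indicator) -> str:
        node = f"exec:{ind.name}"
        # An indicator execution node depends on its child indicator nodes.
        deps[node] = {visit(child) for child in ind.children}
        return node

    # The upper-level aggregation node waits for every first-level indicator;
    # the result aggregation node waits for the upper-level aggregation node.
    deps["aggregate:upper"] = {visit(root) for root in roots}
    deps["aggregate:result"] = {"aggregate:upper"}
    return deps


tree = [Indicator("L1.accuracy", [Indicator("L2.labels", [Indicator("L3.typos")])])]
plan = compile_plan(tree)
# TopologicalSorter expects {node: predecessors} and yields predecessors first.
order = list(TopologicalSorter(plan).static_order())
```

In this sketch the third-level node runs first and the result aggregation node runs last; a real scheduler could run independent branches in parallel as long as the same dependencies are respected.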

3. The dataset quality evaluation method based on a hierarchical indicator system according to claim 2, characterized in that setting the uniqueness constraint on the structured evaluation records before executing the evaluation plan includes:
pre-creating indicator result records and evaluation operator result records for the task to be evaluated;
setting a unique constraint on the combination of task and indicator, so that the same task to be evaluated corresponds to exactly one indicator result record for the same indicator;
setting a unique constraint on the combination of task, indicator, and operator, so that the same task to be evaluated corresponds to exactly one operator result record for the same combination of indicator and evaluation operator;
when concurrent writes or failure retries occur, merging and updating the existing indicator result records or evaluation operator result records, so that changes in the number of retries do not affect the statistical scope.
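
One possible way to realize the uniqueness constraints of claim 3 is a relational store with `UNIQUE` constraints and an upsert on conflict. The sketch below uses Python's built-in `sqlite3` module and assumes an SQLite version with `ON CONFLICT ... DO UPDATE` support (3.24 or later); the table names, column names, and status values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE indicator_result (
    task_id TEXT, indicator_id TEXT, score REAL,
    UNIQUE (task_id, indicator_id)
);
CREATE TABLE operator_result (
    task_id TEXT, indicator_id TEXT, operator_id TEXT, status TEXT,
    UNIQUE (task_id, indicator_id, operator_id)
);
""")


def write_operator_result(task_id, indicator_id, operator_id, status):
    """Insert the record, or merge into the existing one on a retry."""
    conn.execute(
        """INSERT INTO operator_result VALUES (?, ?, ?, ?)
           ON CONFLICT (task_id, indicator_id, operator_id)
           DO UPDATE SET status = excluded.status""",
        (task_id, indicator_id, operator_id, status),
    )


def write_indicator_result(task_id, indicator_id, score):
    """Same pattern for the (task, indicator) constraint."""
    conn.execute(
        """INSERT INTO indicator_result VALUES (?, ?, ?)
           ON CONFLICT (task_id, indicator_id)
           DO UPDATE SET score = excluded.score""",
        (task_id, indicator_id, score),
    )


# A failed attempt followed by a successful retry leaves a single row.
write_operator_result("t1", "L3.typos", "op.spell", "failed")
write_operator_result("t1", "L3.typos", "op.spell", "ok")
rows = conn.execute("SELECT status FROM operator_result").fetchall()
```

Because the retry updates the existing row instead of adding a new one, later counts over `operator_result` are independent of how many times an operator was re-run.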

4. The dataset quality evaluation method based on a hierarchical indicator system according to claim 2, characterized in that executing the evaluation plan and invoking the evaluation operators to generate structured evaluation records includes:
according to the evaluation plan, calling, through the multi-level indicator execution nodes, the evaluation operators bound to each level of indicators to evaluate each sample unit; wherein the sample unit is the basic unit that can be independently evaluated within the evaluation object of the task to be evaluated;
outputting, by the evaluation operator, a structured evaluation record that includes at least: sample location information, indicator identifier, operator identifier, problem label, evidence pointer, execution status, and time consumption information; wherein the sample location information and the problem label together constitute the problem unique identifier, and the evidence pointer is used to establish a verifiable association between the hit result corresponding to the problem label and the sample unit position indicated by the sample location information.
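
For illustration only, the record fields listed in claim 4 could be modeled as an immutable data class whose problem unique identifier is derived from the sample location information and the problem label. The field names, the location format (`file#row=N`), and the pointer format (`ev://...`) are assumptions, not formats defined by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationRecord:
    sample_location: str   # hypothetical format, e.g. "train.jsonl#row=17"
    indicator_id: str
    operator_id: str
    problem_label: str
    evidence_pointer: str  # hypothetical format, e.g. "ev://1"
    status: str
    elapsed_ms: float

    @property
    def problem_uid(self) -> tuple[str, str]:
        """Problem unique identifier: sample location plus problem label."""
        return (self.sample_location, self.problem_label)


# Two operators reporting the same problem on the same sample unit
# produce records that share one problem unique identifier.
a = EvaluationRecord("train.jsonl#row=17", "L3.typos", "op.spell", "typo", "ev://1", "ok", 3.2)
b = EvaluationRecord("train.jsonl#row=17", "L3.typos", "op.regex", "typo", "ev://2", "ok", 1.1)
same_problem = a.problem_uid == b.problem_uid
```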

5. The dataset quality evaluation method based on a hierarchical indicator system according to claim 2, characterized in that storing the structured evaluation records in the evidence chain according to the indicator hierarchy path includes:
writing the structured evaluation records into the evidence chain for storage in a columnar structured manner;
organizing the structured evaluation records stored in the evidence chain according to a preset hierarchy path; wherein the indicator levels in the hierarchy path correspond to the indicator levels in the indicator tree, and the nodes of the hierarchy path are, in order: task, first-level indicator, second-level indicator, third-level indicator, and operator.
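
The storage layout of claim 5 can be pictured with a toy in-memory column store in which each node of the hierarchy path is its own column. This is only a sketch under assumed names (`EvidenceChain`, `rows_under`, the column names); an actual implementation might use a columnar file format or database instead.

```python
PATH_LEVELS = ("task", "level1", "level2", "level3", "operator")
PAYLOAD = ("problem_label", "evidence_pointer")


class EvidenceChain:
    """Toy column store: one Python list per column, rows aligned by index."""

    def __init__(self) -> None:
        self.columns = {name: [] for name in PATH_LEVELS + PAYLOAD}

    def append(self, path: tuple, problem_label: str, evidence_pointer: str) -> None:
        if len(path) != len(PATH_LEVELS):
            raise ValueError("path must be task/level1/level2/level3/operator")
        values = path + (problem_label, evidence_pointer)
        for name, value in zip(PATH_LEVELS + PAYLOAD, values):
            self.columns[name].append(value)

    def rows_under(self, prefix: tuple) -> list[int]:
        """Row indices whose hierarchy path starts with `prefix`."""
        n = len(self.columns["task"])
        return [
            i for i in range(n)
            if all(self.columns[PATH_LEVELS[k]][i] == v for k, v in enumerate(prefix))
        ]


chain = EvidenceChain()
chain.append(("t1", "accuracy", "labels", "typos", "op.spell"), "typo", "ev://1")
chain.append(("t1", "accuracy", "labels", "casing", "op.case"), "bad_case", "ev://2")
chain.append(("t1", "completeness", "fields", "empty", "op.null"), "empty_field", "ev://3")
hits = chain.rows_under(("t1", "accuracy"))
```

Querying by a path prefix returns all evidence under one first-level indicator, which is what makes a score at any level traceable back to its records.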

6. The dataset quality evaluation method based on a hierarchical indicator system according to claim 4, characterized in that deduplicating and counting the structured evaluation records based on the content of the evidence chain to generate label statistics results includes:
extracting the sample location information and problem labels from the current evidence chain;
using the combination of sample location information and problem label as the deduplication key to perform deduplicated counting on the structured evaluation records, so that the same problem in the same sample unit is counted only once, and generating the label statistics results.
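
The deduplicated counting of claim 6 can be sketched in a few lines of Python. The dictionary keys and sample values are hypothetical; the only behavior taken from the claim is that the pair of sample location and label acts as the deduplication key.

```python
from collections import Counter


def label_statistics(records: list[dict]) -> Counter:
    """Count each (sample location, label) pair once, grouped by label."""
    seen: set[tuple[str, str]] = set()
    counts: Counter = Counter()
    for rec in records:
        key = (rec["sample_location"], rec["problem_label"])
        if key in seen:
            continue  # same problem on the same sample unit: already counted
        seen.add(key)
        counts[rec["problem_label"]] += 1
    return counts


records = [
    {"sample_location": "row=1", "problem_label": "typo"},
    {"sample_location": "row=1", "problem_label": "typo"},         # duplicate hit
    {"sample_location": "row=2", "problem_label": "typo"},
    {"sample_location": "row=1", "problem_label": "empty_field"},
]
stats = label_statistics(records)
```

Here the duplicate hit on `row=1` is ignored, so `typo` is counted for two sample units and `empty_field` for one.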

7. The dataset quality evaluation method based on a hierarchical indicator system according to claim 4, characterized in that calculating the hierarchical score of each indicator level and the total score based on the label statistics results includes:
obtaining, by the third-level indicator execution node, the theoretical maximum number of detectable inspection items and the actual number of correct inspection items based on the label statistics results, and obtaining the third-level indicator score from these two numbers; wherein the theoretical maximum number of detectable inspection items is calculated based on the sample size of the sample unit and the number of rejection items, and the sample size of the sample unit is obtained based on the label statistics results;
performing, by the second-level indicator execution node, weighted aggregation of the third-level indicator scores according to the indicator weights to obtain the second-level indicator hierarchical score;
performing, by the first-level indicator execution node, weighted aggregation of the second-level indicator hierarchical scores according to the indicator weights to obtain the first-level indicator hierarchical score;
integrating the scores of the first-level indicators through the upper-level indicator aggregation node, and calculating the total score through the result aggregation node based on the integration result of the upper-level indicator aggregation node.
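
The claim does not fix an exact scoring formula, so the following sketch rests on two stated assumptions: the third-level score is the share of inspection items not hit by a problem, `(max - hits) / max`, and aggregation at each higher level is a weight-normalized average. Function names, indicator names, and the numbers are illustrative only.

```python
def third_level_score(theoretical_max: int, hits: int) -> float:
    """Assumed formula: fraction of inspection items without a problem hit."""
    if theoretical_max <= 0:
        raise ValueError("theoretical_max must be positive")
    return (theoretical_max - hits) / theoretical_max


def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Assumed aggregation: weighted average, normalized by the weights used."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Third level: scores from the theoretical maximum and the deduplicated hits.
l3 = {
    "typos": third_level_score(theoretical_max=200, hits=20),
    "format": third_level_score(theoretical_max=100, hits=50),
}
# Second level: weighted aggregation of its third-level indicators.
l2_labels = weighted_score(l3, {"typos": 0.75, "format": 0.25})
# First level and total: the same aggregation applied one level up.
l1 = {"accuracy": weighted_score({"labels": l2_labels}, {"labels": 1.0})}
total = weighted_score(l1, {"accuracy": 1.0})
```

With these inputs the third-level scores are 0.9 and 0.5, and the aggregated score is approximately 0.8 at every higher level because each upper level has a single child.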

8. The dataset quality evaluation method based on a hierarchical indicator system according to claim 4, characterized in that the structured evaluation record also includes an evidence digest, which is used in place of the evidence pointer when the evidence pointer cannot be saved or is invalid;
the evidence digest includes at least sample location information, the hit field or path, offset information of the hit segment, and a feature identifier for consistency verification.
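
A minimal sketch of such an evidence digest follows. It assumes, purely for illustration, that the feature identifier is a SHA-256 hash of the hit segment and that offsets are character positions; the application does not specify either choice, and the function and key names are hypothetical.

```python
import hashlib


def _fingerprint(segment: str) -> str:
    # Assumed feature identifier: SHA-256 of the UTF-8 encoded hit segment.
    return hashlib.sha256(segment.encode("utf-8")).hexdigest()


def make_digest(sample_location: str, field_path: str,
                start: int, end: int, text: str) -> dict:
    """Build a digest that can stand in for an unavailable evidence pointer."""
    return {
        "sample_location": sample_location,
        "field_path": field_path,
        "offset": (start, end),
        "fingerprint": _fingerprint(text[start:end]),
    }


def verify_digest(digest: dict, text: str) -> bool:
    """Check that the segment at the recorded offset still matches."""
    start, end = digest["offset"]
    return _fingerprint(text[start:end]) == digest["fingerprint"]


original = "The quick brwon fox"
digest = make_digest("train.jsonl#row=17", "text", 10, 15, original)
still_valid = verify_digest(digest, original)
after_fix = verify_digest(digest, "The quick brown fox")
```

The digest verifies against the unchanged sample and fails once the hit segment has been edited, which lets a reviewer detect that the stored evidence no longer matches the data.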

9. A dataset quality evaluation device based on a hierarchical indicator system, characterized in that the device comprises:
Evaluation scheme acquisition module: used to acquire the evaluation scheme and the hierarchical indicator system corresponding to the evaluation target; wherein the hierarchical indicator system includes multi-level indicators, and the evaluation scheme is used to configure indicator weights and veto judgment rules and to determine the activation relationship between the hierarchical indicator system and the evaluation operators;
Evaluation plan acquisition module: used to generate an executable evaluation plan based on the evaluation scheme;
Evaluation record acquisition module: used to execute the evaluation plan and call the evaluation operators to generate structured evaluation records; wherein the structured evaluation records follow the uniqueness constraint: for the same evaluation task, exactly one evaluation record exists for each combination of task and indicator, and exactly one for each combination of task, indicator and operator;
Evidence chain update module: used to store the structured evaluation records in the evidence chain according to the indicator hierarchy path;
Label statistics result acquisition module: used to deduplicate and count the structured evaluation records based on the content of the evidence chain, and to generate label statistics results;
Evaluation score acquisition module: used to calculate the hierarchical score of each indicator level and the total score based on the label statistics results, and to output the veto conclusion and its associated evidence pointer when the veto judgment rules are met; wherein the total score is obtained based on the indicator weights.

10. A computer storage medium having a computer program stored thereon, characterized in that when the computer program is executed by a processor, it implements the dataset quality evaluation method based on a hierarchical indicator system as described in any one of claims 1-8.