A multi-agent based scientific data quality auditing method and system

By constructing a multi-agent system, the problems of task-level collaboration and security isolation in scientific data review are solved, achieving efficient and secure multimodal data review, providing structured evaluation reports and traceability capabilities, and meeting the requirements for scientific research data submission.

CN122198512A · Pending · Publication Date: 2026-06-12 · UNIV OF SCI & TECH BEIJING +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: UNIV OF SCI & TECH BEIJING
Filing Date: 2026-03-17
Publication Date: 2026-06-12

AI Technical Summary

⚠Technical Problem

Existing scientific data review technologies lack multi-agent systems, making it impossible to achieve task-level collaboration. This results in incomplete verification coverage or redundant execution, limiting system scalability and adaptability. Furthermore, the lack of security isolation and full-process operation logs makes it impossible to meet the data submission requirements of scientific research projects.

⚗Method used

A multi-agent system is constructed, which realizes the structural and semantic analysis of scientific data through task scheduling, audit algorithm generation, task execution and quantitative evaluation functional units, generates sub-task sequences with logical dependencies, executes audit algorithms in a constrained execution environment, outputs structured evaluation reports, and collects full-link traceability data in real time.

🎯Benefits of technology

It enhances the planning capabilities for review tasks, improves the accuracy of multimodal data consistency judgment, strengthens system security and stability, provides accurate traceability capabilities and structured evaluation results, and supports efficient and secure review of scientific research data.

Abstract

The application discloses a multi-agent-based scientific data quality auditing method and system, belonging to the field of data quality management and intelligent auditing. The method constructs a multi-agent system comprising task scheduling, algorithm generation, task execution, and quantitative evaluation units, in which: the task scheduling agent generates an audit plan with dependency relationships through semantic analysis and logical clustering of the data; the algorithm generation agent retrieves an algorithm from the knowledge base, or dynamically generates one, based on multi-dimensional evaluation indices; the task execution agent runs the algorithm in a restricted, isolated environment and obtains raw results that include confidence values; and the quantitative evaluation agent outputs a structured report through conflict resolution and weighted aggregation. Meanwhile, the system monitors the execution state in real time, triggers adaptive correction when a threshold is reached, and identifies significant logical patterns using full-link traceability data. Through a comprehensive scoring mechanism, verified rules are automatically deposited into the knowledge base. The application addresses the multi-source, heterogeneous data and the complex, variable rules encountered in scientific data quality auditing, and realizes adaptive scheduling, intelligent execution, and closed-loop knowledge updating in the auditing process.

Description

Technical Field

[0001] This invention relates to the field of data quality management and intelligent auditing, and in particular to a scientific data quality auditing method and system based on multi-agent systems.

Background Technology

[0002] In scientific research project management, such as projects supported by the National Key Research and Development Program and the National Natural Science Foundation of China, research teams are required to standardize and submit all types of scientific data generated during project execution, in accordance with national scientific data management policies. Scientific data mainly refers to data generated in the fields of natural sciences and engineering technology through basic research, applied research, and experimental development, as well as raw data and derived data obtained through observation, monitoring, investigation, and testing and used in scientific research activities. For example, in materials science, scientific data includes, but is not limited to, original experimental records, metallographic microscopic images, XRD spectra, first-principles calculations or molecular dynamics simulation outputs such as VASP, performance test results, relevant academic papers, and patents.

[0003] Current mainstream scientific data auditing technologies fall into two groups. The first is the traditional manual model, in which domain experts rely on experience to judge the accuracy, completeness, consistency, reasonableness, compliance, and usability of the data item by item. The second is the automated verification system based on a rule engine, which uses preset static rules such as completeness checks, format validation, and field constraints to screen data in batches. In recent years, some systems have begun to introduce natural language processing or machine learning models to identify key information or simple logical errors in scientific data. Overall, however, verification remains isolated and single-point, lacking the ability to comprehensively analyze the semantic relationships and contextual dependencies between multi-source data.

[0004] Furthermore, scientific research is characterized by multi-stage coupling in experimental procedures, interdependence between computational simulation and characterization test results, and highly heterogeneous data types. A single intelligent agent has significant limitations in task partitioning, cross-modal consistency judgment, and complex contextual reasoning: (1) a single agent has difficulty forming task-level dynamic audit plans when faced with multi-type, highly correlated scientific data; (2) a single agent cannot simultaneously perform multimodal inspection tasks covering text, images, spectra, experimental data, and computational simulation data, resulting in incomplete audit coverage or redundant execution; (3) a single agent has difficulty scheduling different auditing algorithms in a securely isolated environment, which limits the system's scalability and adaptability.

[0005] Therefore, scientific data auditing urgently requires a multi-agent system oriented towards task collaboration. In this system, agents are functional units implemented through computer programs, capable of sensing input data, invoking one or more algorithms for computational reasoning, and outputting results. Multiple agents, through role division, upstream and downstream dependency management, and task-level scheduling, achieve comprehensive auditing of data integrity, format standardization, semantic consistency, and cross-source correlation.

[0006] Existing automation solutions lack a task-level intelligent scheduling mechanism tailored to the characteristics of scientific data. They cannot generate audit sub-task processes with dependencies based on data content and project context, resulting in incomplete verification coverage or redundant execution. Audit algorithms are usually hard-coded into the system, making it difficult to match and deploy them dynamically on demand, which limits the system's scalability and adaptability. Furthermore, the execution environment lacks effective security isolation, so algorithm anomalies or malicious code can cause data leakage or system failures. The audit process lacks structured records of full-process operation logs and data status, making it impossible to accurately trace the source of problems or to verify the audit afterwards. As a result, existing solutions fail to meet the data quality audit requirements of the data submission process of scientific research projects.

Summary of the Invention

[0007] In view of the aforementioned existing problems, the present invention is proposed.

[0008] Therefore, this invention provides a scientific data quality auditing method based on multi-agent technology. It addresses two problems of existing automation solutions. First, they lack a task-level intelligent scheduling mechanism oriented towards the characteristics of scientific data and cannot generate auditing sub-task processes with dependencies based on scientific data content and project context, which results in incomplete verification coverage or redundant execution. Second, auditing algorithms are usually hard-coded in the system and are difficult to match and deploy dynamically on demand, which limits the scalability and adaptability of the system.

[0009] To solve the above-mentioned technical problems, the present invention provides the following technical solution: In a first aspect, the present invention provides a scientific data quality auditing method based on multi-agent systems, comprising:

Constructing a multi-agent system that includes at least task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional units;

Receiving scientific data and its associated project context information, performing structural and semantic analysis on the scientific data and project context information, generating a sequence of sub-tasks with logical dependencies based on the analysis results, and forming a corresponding audit plan;

Based on the type and data characteristics of each subtask in the subtask sequence, generating an auditing algorithm adapted to the subtask by matching or dynamically combining data from the knowledge base;

Executing the audit algorithm in a restricted execution environment to obtain the audit results corresponding to each subtask;

Normalizing and weighting the audit results based on multi-dimensional weights to output a structured material data quality evaluation report;

Collecting in real time the end-to-end traceability data generated by the interaction of the intelligent agents, identifying audit logic patterns that meet preset conditions based on the traceability data, and automatically storing the verified audit rules in the knowledge base for subsequent material data quality audits.

[0010] As a preferred embodiment of the multi-agent-based scientific data quality auditing method of the present invention, the step of constructing a multi-agent system including at least task scheduling, auditing algorithm generation, task execution, and quantitative evaluation functional units is further specified as follows. Scientific data mainly refers to data generated in the fields of natural sciences, engineering and technology through basic research, applied research, and experimental development, as well as raw data and derived data obtained through observation, monitoring, investigation, testing and detection and used in scientific research activities.

[0011] In a multi-agent system, each task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional unit is implemented in the form of an intelligent agent unit with autonomous perception, decision-making, and execution capabilities. The intelligent agent unit can be deployed as an independent operating entity or a distributed collaborative node. Each intelligent agent unit dynamically calls a preset knowledge base or generative model based on the subject domain characteristics of scientific data, and constructs a logical review chain through serial, parallel or recursive nesting arrangement. Each intelligent agent unit is composed of one or more of mathematical statistics algorithms, rule-driven algorithms, and artificial intelligence algorithms, and is carried and constituted through software logic, software systems, or specific functional modules.

[0012] Intelligent agent units share data flow metadata containing semantic tags through a unified message bus to achieve cross-dimensional logical consistency verification between upstream and downstream intelligent agent units; During the audit execution process, the execution status and result confidence are monitored in real time. When an execution anomaly is detected or the confidence of the output result is lower than a preset threshold, at least one intelligent agent unit that performs the task scheduling function performs adaptive correction based on the full-link traceability information, including secondary decomposition of subtasks, algorithm rematching or parameter reconfiguration, and updates the audit task plan and corresponding execution process accordingly. The updated audit task plan is then handed over to the intelligent agent unit that performs the task execution function for execution again.

[0013] As a preferred embodiment of the multi-agent-based scientific data quality auditing method of the present invention, receiving scientific data and its associated project context information, performing structural and semantic analysis on them, generating a sequence of sub-tasks with logical dependencies based on the analysis results, and forming a corresponding audit plan comprises the following steps:

Obtain the scientific dataset and the related project context information submitted by user terminals or upstream business systems through standardized data access interfaces;

Input the scientific dataset and the project context information into the task scheduling agent, which, according to a preset set of quality dimension rules, performs structural analysis, field mapping, semantic consistency judgment, and rule matching on the scientific dataset;

Based on the rule matching results, identify several logical units with independent audit objectives, which together form the subtask set.

The rule matching includes: A pre-trained large language model, driven by a preset template library, a standard operator set, or specific prompt engineering, serves as the decision core of the task scheduling agent. The scientific data and project context information are jointly input into the rule engine. For each triggered rule, the pre-trained language model extracts multi-dimensional features of the rule semantics, data attributes, and contextual environment, and generates an audit intent node containing semantic vector features; the node's feature vector is composed of rule text features, scientific data features, and project context information features. The vector cosine similarity between each pair of audit intent nodes is then calculated to quantify the semantic association strength between the nodes, and a comprehensive association matrix is constructed by combining this similarity with the data dependencies. Based on the comprehensive association matrix, a clustering algorithm groups the audit intent nodes into several logical clusters; each logical cluster corresponds to a subtask with an independent audit objective and is recorded as a logical unit.
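
The similarity and clustering step can be illustrated with a minimal sketch. It is not the patented procedure: the blending weight `alpha`, the 0/1 data-dependency indicator, the `threshold`, and the choice of connected-component grouping as the clustering algorithm are all assumptions made for illustration, since the text only says that cosine similarity is combined with data dependencies and that a clustering algorithm is applied.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def association_matrix(vectors, data_deps, alpha=0.7):
    """Blend semantic similarity with a 0/1 data-dependency indicator.

    `alpha` and the indicator form are assumptions, not taken from the source.
    `data_deps` is a set of (i, j) index pairs that share a data dependency.
    """
    n = len(vectors)
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        dep = 1.0 if (i, j) in data_deps or (j, i) in data_deps else 0.0
        score = alpha * cosine(vectors[i], vectors[j]) + (1 - alpha) * dep
        matrix[i][j] = matrix[j][i] = score
    return matrix


def cluster(matrix, threshold=0.6):
    """Group nodes whose association reaches `threshold` (union-find).

    Connected-component grouping stands in for the unspecified clustering
    algorithm; each returned group plays the role of one logical cluster.
    """
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each returned group of node indexes would correspond to one subtask; a production system would more likely use a library clustering routine on real embedding vectors.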

[0014] Build a dependency graph between the subtasks. The edge set of the dependency graph represents constraints on the order of execution, ensuring that the dependency graph is a directed graph. The dependency graph serves as the output sub-task flow of the audit plan; each subtask is associated with its corresponding audit strategy identifier, input data range, and expected output type, which together form the audit task plan.
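
A small sketch of how such a plan could be represented follows. The `Subtask` fields mirror the three attributes named above, but the class, the function name, and the requirement that the graph be acyclic (so that an execution order exists) are illustrative assumptions. The ordering relies on the standard-library `graphlib.TopologicalSorter` (Python 3.9+).

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter


@dataclass
class Subtask:
    task_id: str
    strategy_id: str   # audit strategy identifier
    input_range: str   # input data range, e.g. a file or column selector
    output_type: str   # expected output type


def build_plan(subtasks, edges):
    """Return subtasks in an order that respects the dependency edges.

    `edges` holds (before, after) pairs: `before` must finish first.
    Raises graphlib.CycleError if the edges contain a cycle, which is the
    assumed acyclicity check.
    """
    graph = {t.task_id: set() for t in subtasks}
    for before, after in edges:
        graph[after].add(before)  # TopologicalSorter maps node -> predecessors
    order = list(TopologicalSorter(graph).static_order())
    by_id = {t.task_id: t for t in subtasks}
    return [by_id[task_id] for task_id in order]
```

For example, with subtasks `a`, `b`, `c` and edges `("a", "b")` and `("b", "c")`, the plan runs `a`, then `b`, then `c`; adding an edge `("c", "a")` would make planning fail with `CycleError`.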

[0015] As a preferred embodiment of the multi-agent-based scientific data quality auditing method of the present invention, generating an auditing algorithm adapted to each sub-task by matching or dynamically combining data from the knowledge base, based on the type and data characteristics of each sub-task in the sub-task sequence, comprises:

Read the audit task plan and the metadata of each node within it and, for each sub-task node, extract the associated audit strategy identifier, input data range, and expected output type;

Encapsulate the audit strategy identifier, input data range, and expected output type of each subtask as algorithm request parameters, and retrieve an algorithm through the audit algorithm retrieval mechanism.

The audit algorithm retrieval mechanism includes: The audit algorithm generation agent queries the built-in audit algorithm library, in which every algorithm is annotated with its set of applicable strategy identifiers, its set of supported data types, and its output format; an algorithm that satisfies the request parameters is selected as the matching result. If no candidate algorithm meets the conditions, the audit algorithm generation agent uses its trained large model to generate, based on the audit strategy identifier, input data range, and expected output type of the current subtask, the implementation logic or an executable program of an audit algorithm for executing the audit strategy.
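
The retrieval-with-fallback logic might look like the sketch below. The exact matching condition is not given in the text, so the sketch assumes the natural one: the requested strategy identifier is in the algorithm's applicable set, the input data type is in its supported set, and the output formats are equal. `AlgorithmEntry`, `AlgorithmRequest`, and the `generate` callback (standing in for the large-model generation step) are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmEntry:
    name: str
    strategies: frozenset   # applicable strategy identifiers
    data_types: frozenset   # supported data types
    output_format: str


@dataclass(frozen=True)
class AlgorithmRequest:
    strategy_id: str
    data_type: str          # type of the data in the input range
    output_type: str


def retrieve(request, library):
    """Return every library entry compatible with the request (assumed rule)."""
    return [
        entry
        for entry in library
        if request.strategy_id in entry.strategies
        and request.data_type in entry.data_types
        and request.output_type == entry.output_format
    ]


def resolve(request, library, generate):
    """Use library matches when available, else fall back to generation.

    `generate` is a caller-supplied function that builds a new entry for the
    request; here it is only a placeholder for the model-based generator.
    """
    candidates = retrieve(request, library)
    return candidates if candidates else [generate(request)]
```

Keeping `generate` as an injected callback lets the library lookup be tested without any model in the loop.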

[0016] When multiple candidate algorithms meet the conditions, an overall assessment score is calculated for each candidate algorithm from its average execution time and its resource consumption, combined using normalized weighting coefficients. The algorithm with the highest score is selected as the final matching result; if scores are equal, the algorithm with the higher historical accuracy is preferred.

As a preferred embodiment of the multi-agent-based scientific data quality auditing method of the present invention, executing the auditing algorithm in a constrained execution environment to obtain the audit results corresponding to each subtask comprises:

For the current subtask, create a restricted execution environment or isolated computing unit that restricts the subtask's file system, network, and system call permissions, and load the matched auditing algorithm and its dependencies into the restricted execution environment;

Inject the input data fragment corresponding to the subtask into the secure sandbox container as the algorithm input;

Start the audit algorithm to execute the quality verification logic of the subtask. At least one data quality audit operation is performed on the associated scientific data, checking one or more of the accuracy, completeness, consistency, reasonableness, compliance, or usability of the data, and the raw verification result is output, including the problem location, the problem type, and a confidence score;

Encapsulate the raw result of each subtask in a standardized format, and aggregate the raw results of all subtasks into a single raw result dataset.

As a preferred embodiment of the multi-agent-based scientific data quality auditing method of the present invention, normalizing and weighting the audit results based on multi-dimensional weights to output a structured material data quality evaluation report comprises:

Normalize and map the problem location identifiers in the raw result dataset of the subtasks to eliminate coordinate ambiguity caused by different subtasks;

To resolve conflicts among multiple problem results at the same data location, calculate an overall confidence level with a weighted voting mechanism, which, over the set of indexes of all subtasks that point to that data location, combines each subtask's confidence score according to the subtask's authority weight;

According to the preset quality dimension mapping table, map each problem type to at least one quality dimension;

For each dimension, calculate a quality score over the set of problem locations belonging to that dimension and the subtask indexes for each of those data locations;

Introduce dimension importance coefficients and calculate the overall quality index;

Integrate the scores for each dimension, the overall quality index, and the detailed list of issues into a structured, quantitative evaluation report.
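
As a rough illustration of the conflict-resolution and aggregation steps, the sketch below fuses conflicting results per data location and then combines dimension scores into one index. The concrete forms are assumptions: the fused confidence is taken as the authority-weighted mean of the subtask confidences, the winning problem type is the one with the largest weighted confidence mass, and the overall index is an importance-weighted mean. The per-dimension scores are taken as given inputs here. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def fuse_confidence(results, weights):
    """Resolve conflicts per data location by weighted voting (assumed form).

    results: iterable of (subtask_id, location, problem_type, confidence)
    weights: dict mapping subtask_id -> authority weight
    Returns {location: (winning_problem_type, fused_confidence)}.
    """
    by_location = defaultdict(list)
    for subtask_id, location, problem_type, confidence in results:
        by_location[location].append((subtask_id, problem_type, confidence))

    fused = {}
    for location, items in by_location.items():
        total_weight = sum(weights[sid] for sid, _, _ in items)
        confidence = sum(weights[sid] * c for sid, _, c in items) / total_weight
        votes = defaultdict(float)
        for sid, problem_type, c in items:
            votes[problem_type] += weights[sid] * c
        fused[location] = (max(votes, key=votes.get), confidence)
    return fused


def overall_index(dimension_scores, importance):
    """Importance-weighted mean of the per-dimension quality scores."""
    total = sum(importance[d] for d in dimension_scores)
    return sum(importance[d] * s for d, s in dimension_scores.items()) / total
```

With two subtasks reporting on the same location, one with weight 2 and confidence 0.9 and one with weight 1 and confidence 0.5, the fused confidence under this assumed form is (2 × 0.9 + 1 × 0.5) / 3.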

[0017] As a preferred embodiment of the multi-agent-based scientific data quality auditing method of the present invention, collecting in real time the end-to-end traceability data generated by the interaction of the agents, identifying audit logic patterns that meet preset conditions based on the traceability data, and automatically storing the verified audit rules in the knowledge base for subsequent material data quality audits comprises the following steps:

During the task scheduling phase, the traceability agent collects and records the audit task plan, its generation timestamp, the hash value of the input data, and the task scheduling agent identifier;

During the task execution phase, the traceability agent records, for each subtask, the scheduling time, the algorithm matching result, the sandbox startup time, the raw result, and the task execution agent identifier;

The traceability agent organizes the fusion processing parameters, the dimension mapping relationship, the coefficients, and the report generation time chronologically as an event chain;

With data version snapshots and the task dependency graph as its core, the event chain is built into an immutable end-to-end traceability archive, which supports reverse backtracking queries by time, task node, or data object.

After the audit task is completed, salient audit logic patterns are solidified based on the full-link traceability archive and historical audit results. Confirmation is given automatically according to a preset strategy, or confirmation signals from users are received through a human-computer interaction interface, and the confirmed rules or algorithms are solidified into the audit knowledge base, including:

Within a predefined historical audit window, the algorithm extension agent counts the trigger frequency of each audit logic pattern in historical audit tasks and calculates the trigger frequency of the corresponding pattern in the current audit task;

The difference between the two trigger frequencies is calculated for each pattern; when the difference is greater than a preset threshold, the corresponding rule or algorithm is determined to be a salient rule or algorithm to be solidified;

For each identified salient pattern, a corresponding candidate rule or algorithm is generated, and a comprehensive score is calculated for it from its coverage, contribution, and stability, combined using weighting coefficients. Coverage is the sample coverage of the candidate rule on the current task data, that is, the proportion of the total data volume that the rule hits. Contribution is the candidate rule's contribution to the improvement of data quality, that is, the gain in the quality index before and after applying the rule. Stability is the consistency of the candidate rule's performance within the historical window.

The knowledge base is a resource library for storing scientific data quality audit logic, and contains a set of audit rules, algorithm operators, or logical models. The audit logic is implemented through one or more of analysis logic, calculation logic, and judgment logic, and can be invoked in the task execution stage or by subsequent audit tasks.
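
The salience test and the comprehensive score can be sketched as follows. Two points are assumptions: the frequency difference is taken as an absolute difference, and the comprehensive score is taken as a weighted sum of coverage, contribution, and stability. The default weights and all names are hypothetical.

```python
def coverage(hit_count, total_count):
    """Share of the current task's data that the candidate rule hits."""
    return hit_count / total_count if total_count else 0.0


def is_salient(freq_history, freq_current, threshold):
    """Flag a pattern whose trigger frequency shifted by more than `threshold`.

    Using the absolute difference is an assumption; a signed difference
    would be an equally plausible reading.
    """
    return abs(freq_current - freq_history) > threshold


def comprehensive_score(cov, contribution, stability, weights=(0.4, 0.4, 0.2)):
    """Weighted sum of the three criteria (assumed form, illustrative weights)."""
    w_cov, w_con, w_sta = weights
    return w_cov * cov + w_con * contribution + w_sta * stability
```

A pattern that triggered in 10% of historical tasks and 40% of the current task's checks would be flagged at a threshold of 0.2, and its candidate rule would then be scored before being considered for the knowledge base.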

[0018] When the comprehensive score of a candidate rule or algorithm is not less than a preset threshold, the candidate audit rule or algorithm and its associated traceability evidence are pushed to the user terminal through the human-computer interaction interface.

In a second aspect, the present invention provides a scientific data quality auditing system based on multi-agent systems, comprising a task scheduling module, a task execution module, an audit algorithm generation module, a quantitative evaluation module, a full-link traceability module, and a knowledge base expansion module.

The task scheduling module is configured with a task scheduling agent, which performs structural and semantic analysis on scientific data and generates audit task plans with dependencies through feature extraction and logical clustering.

The audit algorithm generation module has a built-in audit algorithm generation agent, which retrieves an appropriate algorithm from the knowledge base based on task data, or calls a generative agent to dynamically construct the execution logic.

The task execution module is configured with a task execution agent and a secure execution environment; it runs the audit algorithm in the isolated secure execution environment and outputs standardized results including location, type, and confidence level.

The quantitative evaluation module is configured with a quantitative evaluation agent, which normalizes and maps multi-source results and fuses them with weights to generate a multi-dimensional structured evaluation report.

The full-link traceability module is configured with a traceability agent, which records the operation logs and data flow information of each agent unit and constructs a full-link traceability archive.

The knowledge base expansion module is configured with an algorithm extension agent, which automatically solidifies audit rules based on historical pattern matching and comprehensive scoring.

The supporting architecture of the multi-agent system also includes a unified message bus and a state monitoring unit, which are used for data sharing among agents and for end-to-end adaptive task rescheduling.
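
The full-link traceability archive maintained by the traceability module can be illustrated with a hash-linked event chain. The text requires the archive to be immutable and queryable by time, task node, or data object, but does not say how; linking each event to the SHA-256 digest of its predecessor is an assumed technique chosen here because it makes tampering detectable. The class and field names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first event


def _digest(body):
    """Stable SHA-256 digest of an event body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class EventChain:
    """Append-only event log in which each event commits to the previous one."""

    def __init__(self):
        self.events = []

    def append(self, stage, agent_id, timestamp, payload):
        prev_hash = self.events[-1]["hash"] if self.events else GENESIS
        body = {
            "stage": stage,          # e.g. "scheduling", "execution", "report"
            "agent_id": agent_id,
            "timestamp": timestamp,
            "payload": payload,      # must be JSON-serializable
            "prev_hash": prev_hash,
        }
        self.events.append({**body, "hash": _digest(body)})

    def verify(self):
        """Return True only if no event was altered or reordered."""
        prev_hash = GENESIS
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "hash"}
            if event["prev_hash"] != prev_hash or _digest(body) != event["hash"]:
                return False
            prev_hash = event["hash"]
        return True

    def query(self, **filters):
        """Backtracking query on top-level fields, e.g. query(stage="execution")."""
        return [e for e in self.events if all(e.get(k) == v for k, v in filters.items())]
```

Any later edit to a recorded payload changes that event's digest and breaks the link from its successor, so `verify()` returns `False`.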

[0019] In a third aspect, the present invention provides a computer device including a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, implements any step of the multi-agent-based scientific data quality auditing method described in the first aspect of the present invention.

[0020] In a fourth aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements any step of the multi-agent-based scientific data quality auditing method described in the first aspect of the present invention.

[0021] The beneficial effects of this invention are as follows:

By constructing a task-level intelligent scheduling mechanism based on multi-agent collaboration, the system can dynamically generate sub-task sequences with dependencies according to the content and context of scientific data, thereby avoiding the incomplete verification coverage or redundant execution of traditional solutions and improving the planning capability for audit tasks in complex scientific research scenarios.

By having the audit algorithm generation agent retrieve execution logic from the knowledge base or compose it dynamically, the audit algorithm is adapted to the features of multimodal, heterogeneous scientific data, which solves the poor scalability caused by hard-coding audit algorithms and improves the accuracy of cross-modal data consistency judgment.

By introducing logical clustering and feature extraction in the task scheduling stage, multi-dimensional audit intentions are transformed into independent sub-task units and a dependency graph is established, thereby ensuring the logical rigor and execution efficiency of the audit process.

By running the audit algorithm in an isolated, restricted execution environment and strictly limiting system calls and external interface access, the risk of data leakage caused by algorithm anomalies or malicious code is reduced, enhancing the security and stability of the system when processing sensitive scientific research data.

By introducing a weighted voting mechanism and a conflict resolution algorithm into the quantitative evaluation agent, the results of multi-source sub-tasks are normalized and weighted, elevating the audit results from single issue labels to multi-dimensional structured scores supported by confidence values and enhancing the objectivity and interpretability of the evaluation results.

Through real-time collection of end-to-end traceability data and the construction of an immutable event chain, the audit process can be traced backwards precisely from result output to data access, providing complete and transparent technical support for audit verification, issue tracing, and reproduction of scientific research data.

Through salient logical pattern recognition and a comprehensive scoring mechanism based on historical traceability archives, validated audit experience is automatically accumulated in the knowledge base, enabling the system to evolve continuously and to remedy the insufficient coverage and outdated rules of traditional manual auditing in long-term application scenarios.

Attached Figure Description

[0022] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0023] Figure 1 is a flowchart of the multi-agent-based scientific data quality auditing method in Example 1.

[0024] Figure 2 is a schematic diagram of the multi-agent-based scientific data quality auditing system in Example 1.

Detailed Implementation

[0025] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

[0026] Many specific details are set forth in the following description in order to provide a full understanding of the invention. However, the invention may also be practiced in other ways different from those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the invention. Therefore, the invention is not limited to the specific embodiments disclosed below.

[0027] Secondly, the term "one embodiment" or "embodiment" as used herein refers to a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not necessarily refer to the same embodiment, nor is it a single or selective embodiment that is mutually exclusive with other embodiments.

[0028] Example 1. Referring to Figure 1 and Figure 2, this embodiment of the invention provides a scientific data quality auditing method based on multi-agent systems, comprising the following steps:

S1. Construct a multi-agent system that includes at least task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional units.

[0029] Furthermore, the scientific data referred to mainly refers to data generated in the fields of natural sciences, engineering and technology sciences through basic research, applied research, experimental development, etc., as well as raw data and its derivative data obtained through observation and monitoring, investigation and survey, testing and detection and used for scientific research activities.

[0030] In the multi-agent system, each task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional unit is implemented in the form of an intelligent agent unit with autonomous perception, decision-making, and execution capabilities. The intelligent agent unit can be deployed as an independent operating entity or a distributed collaborative node. Each intelligent agent unit dynamically calls a preset knowledge base or generative model based on the subject domain characteristics of scientific data, and constructs a logical audit chain through serial, parallel or recursive nesting arrangement. Each intelligent agent unit is composed of one or more of mathematical statistics algorithms, rule-driven algorithms, and artificial intelligence algorithms, and is carried and constituted through software logic, software systems, or specific functional modules.

[0031] The intelligent agent units share data flow metadata containing semantic tags through a unified message bus to realize cross-dimensional logical consistency verification between upstream and downstream intelligent agent units; During the audit execution process, the execution status and result confidence are monitored in real time. When an execution anomaly is detected or the confidence of the output result is lower than a preset threshold, at least one intelligent agent unit that performs the task scheduling function performs adaptive correction based on the full-link traceability information, including secondary decomposition of subtasks, algorithm rematching or parameter reconfiguration, and updates the audit task plan and corresponding execution process accordingly. The updated audit task plan is then handed over to the intelligent agent unit that performs the task execution function for execution again.

[0032] It should be noted that by constructing a multi-agent system with autonomous perception and adaptive correction capabilities, deeply integrating generative models and deterministic algorithms, and achieving full-link logical collaboration based on semantic metadata, it is possible to perform intelligent structural deconstruction and dynamic review of scientific data in different disciplines. While improving review efficiency, it ensures the traceability and quality compliance of the data output process of complex scientific research activities.

[0033] S2. Receive scientific data and its associated project context information; perform structural and semantic analysis on the scientific data and project context information; generate a sequence of sub-tasks with logical dependencies based on the analysis results; and form a corresponding review plan. The specific steps are as follows: Through a standardized data access interface, obtain the scientific data set submitted by a user terminal or upstream business system, together with the related project context information; Input the scientific data set and the project context information into the task scheduling agent, which performs structural analysis, field mapping, semantic consistency judgment, and rule matching on the scientific data set according to a preset set of quality dimension rules; Based on the rule matching results, identify several logical units with independent review objectives, denoted as the subtask set. The rule matching includes: A pre-trained large language model, driven by a preset template library, a standard operator set, or specific prompt engineering, serves as the decision core of the task scheduling agent; The scientific data and project context information are jointly input into the rule engine, and for each triggered rule the pre-trained language model extracts multi-dimensional features covering rule semantics, data attributes, and contextual environment to generate an audit intent node containing semantic vector features, where the features of each node comprise rule text features, scientific data features, and project context information features; The vector cosine similarity between each pair of audit intent nodes is then calculated to quantify the semantic association strength between the nodes, and is combined with data dependencies to construct a comprehensive association matrix; Based on the comprehensive association matrix, a clustering algorithm groups the audit intent nodes into several logical clusters, where each logical cluster corresponds to a subtask with an independent audit objective and is recorded as a logical unit.
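The similarity, association-matrix, and clustering steps of [0033] can be sketched as follows. This is only an illustrative sketch: the text does not state how cosine similarity and data dependencies are combined or which clustering algorithm is used, so the mixing weight `lam`, the 0/1 dependency indicator, the `threshold`, and the connected-components grouping are all assumptions, as are the example vectors.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def association_matrix(vectors, depends, lam=0.7):
    """Blend semantic similarity with a 0/1 data-dependency indicator.

    `lam` is a hypothetical mixing weight, not taken from the source.
    """
    n = len(vectors)
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        dep = 1.0 if (i, j) in depends or (j, i) in depends else 0.0
        m[i][j] = m[j][i] = lam * cosine(vectors[i], vectors[j]) + (1 - lam) * dep
    return m


def cluster(matrix, threshold=0.6):
    """Group nodes linked by association >= threshold (union-find).

    Stand-in for the unspecified clustering algorithm.
    """
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] >= threshold:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four hypothetical audit intent nodes; nodes 0 and 1 share a data dependency.
vectors = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.2], [0.0, 1.0, 0.0], [0.1, 0.9, 0.1]]
clusters = cluster(association_matrix(vectors, depends={(0, 1)}))
print(clusters)
```

Each resulting group of node indexes would correspond to one subtask in the sense of [0033].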

[0034] Build a dependency graph between the subtasks. The edge set of the dependency graph expresses constraints on the order of execution, so that the dependency graph is a directed graph. The dependency graph serves as the sub-task flow output of the review plan, and each subtask is associated with its corresponding review strategy identifier, input data range, and expected output type, thereby forming the audit task plan.
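A minimal sketch of the audit task plan of [0034] is given below, using Python's standard `graphlib` module. The source only says the dependency graph is directed; the sketch additionally assumes it is acyclic so that an execution order exists. The `SubTask` fields mirror the three attributes named in the text, while the class name, identifiers such as `S-SCHEMA`, and the example subtasks are hypothetical.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class SubTask:
    task_id: str
    strategy_id: str   # review strategy identifier
    input_range: str   # input data range
    output_type: str   # expected output type


def build_plan(subtasks, edges):
    """Return the subtasks in an order that respects every (before, after) edge.

    Raises graphlib.CycleError (a ValueError) if the constraints are cyclic.
    """
    by_id = {t.task_id: t for t in subtasks}
    sorter = TopologicalSorter({t.task_id: set() for t in subtasks})
    for before, after in edges:
        sorter.add(after, before)  # `after` depends on `before`
    return [by_id[i] for i in sorter.static_order()]


tasks = [
    SubTask("consistency", "S-CONS", "table:measurements", "findings"),
    SubTask("schema", "S-SCHEMA", "table:measurements", "findings"),
    SubTask("range", "S-RANGE", "column:temperature_K", "findings"),
]
edges = [("schema", "range"), ("schema", "consistency"), ("range", "consistency")]
order = [t.task_id for t in build_plan(tasks, edges)]
print(order)
```

Rejecting cyclic constraints at planning time keeps the later execution stage from waiting on a subtask that can never become ready.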

[0035] It should be noted that by structurally analyzing scientific data and project context, quantifying the correlation strength between review intentions using a vector space model, and constructing a directed graph review plan with sequential constraints, the intelligent transformation of review tasks from "receiving raw materials" to "executing structured logic" is achieved. This not only improves the accuracy of review task allocation but also lays a solid foundation for end-to-end quality traceability and adaptive correction by pre-setting strategy identifiers and input / output specifications for each sub-task.

[0036] S3. Based on the type and data characteristics of each subtask in the subtask sequence, generate a review algorithm adapted to the subtask by matching or dynamically combining entries from the knowledge base, including: Read the audit task plan and the metadata of each node within it, and extract from each subtask node the associated review strategy identifier, input data range, and expected output type; Encapsulate each subtask's review strategy identifier, input data range, and expected output type as algorithm request parameters, and retrieve a matching algorithm using the audit algorithm retrieval mechanism. The retrieval mechanism includes: The audit algorithm generation agent queries the built-in audit algorithm library, in which each algorithm is marked with its set of applicable strategy identifiers, its set of supported data types, and its output format; The algorithm that satisfies the request parameters is selected as the matching result; If no candidate algorithm meets these conditions, the trained large model algorithm generation agent in the audit algorithm generation module uses the current subtask's review strategy identifier, input data range, and expected output type to generate the audit algorithm implementation logic or an executable algorithm program for executing the audit strategy, or directly specifies the structured access path of the data in the file so that the audit code can read the data directly. The generation process includes: Using a pre-trained programming language model combined with domain knowledge bases and typical verification pattern templates, generate functional or data-structure-oriented verification logic that conforms to the specifications, such that the data can be read or parsed directly; Perform syntax and execution security checks on the generated algorithm to ensure that it can run correctly in a secure sandbox environment; If code errors are detected, the code generation process can be executed again, with the error information provided to the large model for reference; The generated algorithm, together with the corresponding rule file, regular expression template, and external resource reference list, is encapsulated into an executable container image, and a unique access identifier is registered for the task execution agent to call. When multiple candidate algorithms meet the conditions, an overall assessment score is calculated for each candidate algorithm from its average execution time and its resource consumption, combined using normalized weighting coefficients; the weighting coefficients are determined through pre-verification or heuristic search according to the accuracy requirements of the actual review scenario. The algorithm with the highest score is selected as the final matching result; if scores are equal, the algorithm with the higher historical accuracy is preferred. It should be noted that a driving mechanism based on the triple of review strategy identifier, input data range, and expected output type is adopted to construct a complete process from algorithm retrieval and quantitative evaluation to dynamic algorithm generation. The system first selects high-efficiency algorithms from the knowledge base and, when necessary, automatically constructs and verifies new logic through large models. This enables on-demand access to and rapid expansion of review capabilities, addressing the challenges posed by diverse data characteristics while keeping results consistent.
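The retrieval and selection logic of [0036] might look like the sketch below. The exact scoring expression is not recoverable from the text, so the form used here (a weighted sum in which lower normalized execution time and resource consumption give a higher score) is an assumption, as are the default weights, the `Algorithm` record, and all example values. Returning `None` stands for the case where the generation agent would be asked to build a new algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    name: str
    strategies: frozenset    # applicable strategy identifiers
    data_types: frozenset    # supported data types
    output_format: str
    avg_time: float          # average execution time, normalized to [0, 1]
    resource_use: float      # resource consumption, normalized to [0, 1]
    accuracy: float          # historical accuracy, used only to break ties


def score(a, w_time=0.5, w_res=0.5):
    """Assumed scoring form: cheaper and faster algorithms score higher."""
    return w_time * (1.0 - a.avg_time) + w_res * (1.0 - a.resource_use)


def select(library, strategy_id, data_type, output_type):
    """Match on the request triple, then rank by score and historical accuracy."""
    candidates = [
        a for a in library
        if strategy_id in a.strategies
        and data_type in a.data_types
        and a.output_format == output_type
    ]
    if not candidates:
        return None  # caller would fall back to dynamic generation
    return max(candidates, key=lambda a: (score(a), a.accuracy))


library = [
    Algorithm("range_v1", frozenset({"S-RANGE"}), frozenset({"float"}), "findings", 0.25, 0.5, 0.90),
    Algorithm("range_v2", frozenset({"S-RANGE"}), frozenset({"float"}), "findings", 0.25, 0.5, 0.95),
    Algorithm("slow_range", frozenset({"S-RANGE"}), frozenset({"float"}), "findings", 0.75, 0.75, 0.99),
]
chosen = select(library, "S-RANGE", "float", "findings")
print(chosen.name)
```

Sorting on the tuple `(score, accuracy)` implements the tie-break rule directly: accuracy is consulted only when two scores are equal.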

[0037] S4. Execute the audit algorithm in a restricted execution environment to obtain the audit results corresponding to each subtask.

[0038] Furthermore, for the current subtask, create a restricted execution environment or isolated computing unit that restricts the subtask's file system, network, and system call permissions, and load the matched audit algorithm and its dependencies into the restricted execution environment; Inject the input data fragment corresponding to the subtask into the secure sandbox container as algorithm input; Start the audit algorithm to execute the quality verification logic of the subtask, so that the associated scientific data undergoes at least one data quality audit operation that checks one or more of the accuracy, completeness, consistency, reasonableness, compliance, or usability of the data, and output the original verification result, including the problem location, problem type, and confidence score; Encapsulate the original result of each subtask into a standardized format, and aggregate the original results of all subtasks into a single result data set. It should be noted that the task execution agent, by constructing a constrained execution environment, achieves refined verification of complex scientific data while ensuring algorithm isolation and system security. By decoupling input data fragments from standardized algorithm logic, this mechanism supports in-depth data quality analysis along multiple dimensions, including accuracy and completeness. The output adopts a "position-type-confidence" triple structure, which unifies the feedback format of heterogeneous algorithms and, through confidence scoring, provides a quantitative basis for the reliability of the review results, laying a standardized foundation for the final result aggregation and decision-making.
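The execution flow of [0038] can be illustrated with the simplified sketch below, which runs audit code in a separate Python interpreter, passes the data fragment on standard input, and reads "position-type-confidence" findings back as JSON. This is an assumption-laden stand-in: a child process with a timeout does not restrict file system, network, or system call permissions, which in practice would need a container or an operating-system sandbox. The function name, the JSON protocol, and the example audit code are hypothetical.

```python
import json
import subprocess
import sys


def run_audit(code: str, fragment: dict, timeout_s: float = 5.0) -> list:
    """Run audit code in a child interpreter and return its findings.

    `-I` starts Python in isolated mode (no user site-packages, no
    PYTHON* environment variables). It is NOT a security sandbox.
    """
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],
        input=json.dumps(fragment),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr.strip())
    return json.loads(proc.stdout)


# Hypothetical audit algorithm: a completeness check on one field.
AUDIT_CODE = r'''
import json, sys
rows = json.load(sys.stdin)["rows"]
findings = [
    {"location": f"row:{i}/field:temperature_K", "type": "completeness", "confidence": 1.0}
    for i, row in enumerate(rows)
    if row.get("temperature_K") is None
]
json.dump(findings, sys.stdout)
'''

fragment = {"rows": [{"temperature_K": 300.0}, {"temperature_K": None}]}
findings = run_audit(AUDIT_CODE, fragment)
print(findings)
```

Keeping the interface to "fragment in, list of triples out" is what allows heterogeneous algorithms to feed the same aggregation step.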

[0039] S5. Normalize the audit results and perform weighted summarization based on multi-dimensional weights to output a structured material data quality evaluation report.

[0040] Furthermore, the problem location identifiers in the original result data set of the subtasks are normalized and mapped to eliminate coordinate ambiguity between different subtasks; To resolve conflicts among multiple problem results reported for the same data location, a weighted voting mechanism is used to calculate the overall confidence of that location: over the set of all subtask indexes pointing to the data location, each subtask's confidence score is weighted by that subtask's authority weight; According to the preset quality dimension mapping table, each problem type is mapped to at least one of the quality dimensions; For each dimension, a dimension quality score is calculated over the set of problem locations belonging to that dimension and the subtask indexes associated with each of those data locations; Dimension importance coefficients are then introduced to calculate the overall quality index. It should be noted that by normalizing the locations in the results of multi-source sub-tasks, resolving conflicts, and quantifying multi-dimensional indicators, the ambiguity and contradictions caused by heterogeneous verification can be eliminated, generating a comprehensive, objective, and clearly hierarchical quality evaluation report, thereby improving the credibility of the audit conclusions and their value in supporting decision-making.
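One possible reading of the fusion step in [0040] is sketched below. The original expressions are not recoverable from the text, so every formula here is an assumption: the overall confidence of a location is taken as the authority-weighted mean of the subtask confidences, the dimension quality score as one minus the fused confidence mass per audited location, and the overall quality index as an importance-weighted average. All names and numbers are hypothetical.

```python
from collections import defaultdict


def fuse(findings, authority):
    """findings: (location, problem_type, subtask_id, confidence) tuples.

    Returns {location: (weighted-mean confidence, set of problem types)}.
    """
    num, den, types = defaultdict(float), defaultdict(float), defaultdict(set)
    for loc, ptype, task, conf in findings:
        w = authority[task]
        num[loc] += w * conf
        den[loc] += w
        types[loc].add(ptype)
    return {loc: (num[loc] / den[loc], types[loc]) for loc in num}


def dimension_scores(fused, dim_map, n_locations):
    """Assumed form: score = 1 - (sum of fused confidences) / audited locations."""
    penalty = defaultdict(float)
    for conf, types in fused.values():
        for dim in {d for t in types for d in dim_map[t]}:
            penalty[dim] += conf
    dims = {d for ds in dim_map.values() for d in ds}
    return {d: 1.0 - penalty[d] / n_locations for d in dims}


def overall_index(scores, importance):
    """Importance-weighted average of the dimension scores."""
    total = sum(importance.values())
    return sum(importance[d] * scores[d] for d in scores) / total


findings = [
    ("r1", "missing", "t1", 0.9),
    ("r1", "missing", "t2", 0.6),   # second opinion on the same location
    ("r2", "out_of_range", "t1", 0.5),
]
authority = {"t1": 2.0, "t2": 1.0}
dim_map = {"missing": ["completeness"], "out_of_range": ["accuracy"]}

fused = fuse(findings, authority)
scores = dimension_scores(fused, dim_map, n_locations=10)
index = overall_index(scores, {"completeness": 0.25, "accuracy": 0.75})
print(fused["r1"][0], scores, index)
```

In this example the two reports on `r1` fuse to a confidence of about 0.8, because the subtask with authority 2.0 counts twice as much as the other.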

[0041] S6. Collect the full-link traceability data generated by the interaction of each intelligent agent in real time, and identify the audit logic mode that meets the preset conditions based on the traceability data. Automatically store the verified audit rules into the knowledge base for subsequent material data quality audit.

[0042] Furthermore, during the task scheduling phase, the traceability agent collects and records the review task plan, its generation timestamp, the hash value of the input data, and the task scheduling agent identifier; During the task execution phase, the traceability agent records, for each subtask, the scheduling time, the algorithm matching result, the sandbox startup time, the original result, and the task execution agent identifier; The traceability agent organizes the fusion processing parameters, the dimension mapping relationship, the coefficients, and the report generation time chronologically into an event chain; With data version snapshots and the task dependency graph as its core, the event chain is used to construct an immutable full-link traceability archive, which supports reverse backtracking queries by time, task node, or data object; After the review task is completed, salient review logic patterns are solidified based on the full-link traceability archive and historical review results: confirmation is given automatically according to a preset strategy or received from the user through a human-computer interaction interface, and the confirmed rules or algorithms are solidified into the review knowledge base, including: Within a preset historical review window, the algorithm extension agent counts the trigger frequency of each review logic pattern in historical review tasks and calculates the trigger frequency of the corresponding pattern in the current review task; The difference between the two trigger frequencies is calculated for each pattern, and when the difference is greater than a preset threshold, the corresponding rule or algorithm is determined to be a salient rule or algorithm to be solidified; For each identified salient rule or algorithm, a corresponding candidate rule or algorithm is generated, and a comprehensive score is calculated for it as a weighted combination of coverage, contribution, and stability, each with its own weighting coefficient, where coverage is the sample coverage of the candidate rule on the current task data, that is, the proportion of the total data volume that the rule hits, contribution is the candidate rule's contribution to improving data quality, that is, the gain in the quality index before and after applying the rule, and stability is the consistency of the candidate rule's performance within the historical window; The knowledge base is a resource library for storing scientific data quality review logic, which contains a set of review rules, algorithm operators, or logical models; The audit logic is implemented through one or more combinations of analysis logic, calculation logic, and judgment logic, and can be invoked by the task execution stage or by subsequent audit tasks.
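The saliency test and comprehensive score of [0042] can be sketched as follows. The source does not say whether the frequency difference is signed or absolute, nor does it give the weighting coefficients, so the signed difference (current minus historical) and the example weights are assumptions; all pattern names and values are hypothetical.

```python
def salient_patterns(hist_freq, cur_freq, threshold):
    """Patterns whose current trigger frequency exceeds the historical
    frequency by more than `threshold` (signed difference, assumed)."""
    return sorted(
        p for p, f in cur_freq.items()
        if f - hist_freq.get(p, 0.0) > threshold
    )


def comprehensive_score(coverage, contribution, stability, weights):
    """Weighted combination of coverage, contribution, and stability."""
    w_cov, w_con, w_stab = weights
    return w_cov * coverage + w_con * contribution + w_stab * stability


hist = {"unit_check": 0.10, "range_check": 0.30}
cur = {"unit_check": 0.45, "range_check": 0.32}
candidates = salient_patterns(hist, cur, threshold=0.2)

# Coverage: share of records the rule hits; contribution: quality-index gain;
# stability: consistency within the historical window (all on a 0..1 scale).
s = comprehensive_score(0.5, 0.25, 1.0, weights=(0.5, 0.25, 0.25))
print(candidates, s)
```

Only candidates whose score reaches the preset threshold would then be pushed for confirmation, as the following paragraph describes.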

[0043] When the comprehensive score of a candidate rule or algorithm is not less than the preset threshold, the candidate review rule or algorithm and its associated traceability evidence are pushed to the user terminal through the human-computer interaction interface. It should be noted that by collecting key operation logs and data status throughout the entire process and constructing an event chain traceability archive, the system achieves full-element recording and full-process traceability of the audit process. This not only meets the needs of audit verification, problem localization, and responsibility definition in high-compliance scenarios, but also improves the interpretability of audit results, process transparency, and operational efficiency by establishing an explicit correlation and visual linkage mechanism between quality issues and traceability events. On this basis, the system uses the full-link traceability data to identify salient rules or algorithms in a data-driven manner, by statistically analyzing the trigger frequency of audit logic patterns within the historical window and quantifying the frequency offset of the current task. Combined with a comprehensive scoring function consisting of coverage, contribution, and stability, and under the constraint of a manual confirmation mechanism, the system solidifies the rules while reducing the risks of misintroduction and overfitting that may occur during automatic evolution. This allows the audit knowledge base to evolve and be updated continuously under controllable and interpretable conditions.
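The event chain traceability archive can be illustrated with the sketch below. The text calls the archive immutable without saying how; hash-chaining each record to its predecessor is one assumed way to make later tampering detectable, and is not presented here as the method of the invention. The record layout, field names, and helper functions are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first event


def _digest(prev_hash, event):
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain, event):
    """Append an event whose hash covers the previous record's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev, "hash": _digest(prev, event)})


def verify(chain):
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev = GENESIS
    for rec in chain:
        if rec["prev"] != prev or rec["hash"] != _digest(prev, rec["event"]):
            return False
        prev = rec["hash"]
    return True


def backtrack(chain, task_id):
    """Reverse-chronological query by task node."""
    return [r["event"] for r in reversed(chain) if r["event"].get("task") == task_id]


chain = []
append_event(chain, {"phase": "scheduling", "task": "plan", "ts": 1})
append_event(chain, {"phase": "execution", "task": "range", "ts": 2})
append_event(chain, {"phase": "evaluation", "task": "range", "ts": 3})
print(verify(chain), [e["phase"] for e in backtrack(chain, "range")])
```

The same `backtrack` pattern extends to queries by time or data object by filtering on other event fields.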

[0044] This embodiment also provides a scientific data quality auditing system based on multi-agent systems, comprising a task scheduling module, a task execution module, an audit algorithm generation module, a quantitative evaluation module, a full-link traceability module, and a knowledge base extension module. The task scheduling module is configured with a task scheduling intelligent agent, which performs structural and semantic analysis on scientific data and generates audit task plans with dependencies through feature extraction and logical clustering. The audit algorithm generation module has a built-in audit algorithm generation intelligent agent, which retrieves the appropriate algorithm from the knowledge base based on task data, or calls the generative intelligent agent to dynamically construct the execution logic. The task execution module is configured with a task execution agent and a secure execution environment, and runs the audit algorithm in the isolated secure execution environment to output standardized results including location, type, and confidence level. The quantitative evaluation module is configured with a quantitative evaluation intelligent agent, which normalizes and maps multi-source results and fuses them with weights to generate a multi-dimensional structured evaluation report. The full-link traceability module is configured with a traceability intelligent agent, which records the operation logs and data flow information of each intelligent agent unit and constructs a full-link traceability archive. The knowledge base extension module is configured with an algorithm extension intelligent agent, which automatically solidifies review rules based on historical pattern matching and comprehensive scoring. The supporting architecture of the multi-agent system also includes a unified message bus and a state monitoring unit, which are used for data sharing among agents and end-to-end adaptive task rescheduling.

[0045] This embodiment also provides a computer device applicable to the multi-agent-based scientific data quality auditing method, comprising: a memory and a processor; the memory is used to store computer-executable instructions, and the processor is used to execute the computer-executable instructions to implement the multi-agent-based scientific data quality auditing method proposed in the above embodiment.

[0046] The computer device can be a terminal, comprising a processor, memory, communication interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, carrier networks, NFC (Near Field Communication), or other technologies. The display screen can be an LCD screen or an e-ink screen. The input devices can be a touch layer covering the display screen, buttons, a trackball, or a touchpad on the computer device's casing, or an external keyboard, touchpad, or mouse.

[0047] This embodiment also provides a storage medium storing a computer program that, when executed by a processor, implements the multi-agent-based scientific data quality auditing method proposed in the above embodiments. The storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.

[0048] In summary, this invention constructs a scientific data quality auditing method and system based on multi-agent collaboration. It utilizes a task scheduling agent to perform structural analysis and logical clustering of scientific data and its project context, generating auditing task plans with dependencies. This effectively solves the problems of missing task planning and incomplete verification coverage in traditional solutions. Combined with the auditing algorithm-generating agent's precise matching and dynamic logic construction of the knowledge base, it achieves highly adaptable auditing for multimodal heterogeneous data. Furthermore, in a constrained execution environment, a security isolation mechanism ensures the security of sensitive scientific data and the stability of system operation. By introducing a weighted fusion and conflict resolution algorithm through a quantitative evaluation agent, the results of multi-source sub-tasks are transformed into structured reports with confidence support. Simultaneously, a tracing agent records the entire operation log and constructs an immutable event chain. This not only achieves precise tracing and audit verification of the auditing process but also enables continuous evolution and self-improvement of the system logic through automatic identification and rule accumulation of significant auditing patterns. This provides an intelligent, process-oriented, and adaptive solution for scientific data quality management.

[0049] It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.

Claims

1. A method for scientific data quality auditing based on multi-agent systems, characterized in that it comprises: Constructing a multi-agent system that includes at least task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional units; Receiving scientific data and its associated project context information, performing structural and semantic analysis on the scientific data and project context information, generating a sequence of sub-tasks with logical dependencies based on the analysis results, and forming a corresponding review plan; Based on the type and data characteristics of each subtask in the subtask sequence, generating an auditing algorithm adapted to the subtask by matching or dynamically combining entries from the knowledge base; Executing the audit algorithm in a restricted execution environment to obtain the audit results corresponding to each subtask; Normalizing the audit results and performing weighted summarization based on multi-dimensional weights to output a structured material data quality evaluation report; Collecting, in real time, the full-link traceability data generated by the interaction of the intelligent agents, identifying audit logic patterns that meet preset conditions based on the traceability data, and automatically storing the verified audit rules in the knowledge base for subsequent material data quality audits.

2. The scientific data quality auditing method based on multi-agent technology as described in claim 1, characterized in that: Construct a multi-agent system that includes at least task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional units, including: The scientific data referred to mainly refers to data generated in the fields of natural sciences, engineering and technology sciences through basic research, applied research, experimental development, etc., as well as raw data and its derivative data obtained through observation and monitoring, investigation and survey, testing and detection and used for scientific research activities; In the multi-agent system, each task scheduling, audit algorithm generation, task execution, and quantitative evaluation functional unit is implemented in the form of an intelligent agent unit with autonomous perception, decision-making, and execution capabilities. The intelligent agent unit can be deployed as an independent operating entity or a distributed collaborative node. Each intelligent agent unit dynamically calls a preset knowledge base or generative model based on the subject domain characteristics of scientific data, and constructs a logical audit chain through serial, parallel or recursive nesting arrangement. Each intelligent agent unit is composed of one or more of mathematical statistics algorithms, rule-driven algorithms, and artificial intelligence algorithms, and is carried and constituted by software logic, software systems, or specific functional modules. The intelligent agent units share data flow metadata containing semantic tags via a unified message bus to achieve cross-dimensional logical consistency verification between upstream and downstream intelligent agent units. During the audit execution process, the execution status and result confidence are monitored in real time. 
When an execution anomaly is detected or the confidence of the output result is lower than a preset threshold, at least one intelligent agent unit executing the task scheduling function performs adaptive correction based on the full-link traceability information, including secondary decomposition of subtasks, algorithm rematching or parameter reconfiguration, and updates the audit task plan and corresponding execution process accordingly. The updated audit task plan is then reassigned to the intelligent agent unit executing the task execution function for execution.

3. The scientific data quality auditing method based on multi-agent systems as described in claim 1, characterized in that: Receiving scientific data and its associated project context information, performing structural and semantic analysis on the scientific data and project context information, generating a sequence of sub-tasks with logical dependencies based on the analysis results, and forming a corresponding review plan specifically comprises: Obtaining, through a standardized data access interface, the scientific data set submitted by a user terminal or upstream business system, together with the related project context information; Inputting the scientific data set and the project context information into the task scheduling agent, which performs structural analysis, field mapping, semantic consistency judgment, and rule matching on the scientific data set according to a preset set of quality dimension rules; Based on the rule matching results, identifying several logical units with independent review objectives, denoted as the subtask set; The rule matching includes: A pre-trained large language model, driven by a preset template library, a standard operator set, or specific prompt engineering, serves as the decision core of the task scheduling agent; The scientific data and project context information are jointly input into the rule engine, and for each triggered rule the pre-trained language model extracts multi-dimensional features covering rule semantics, data attributes, and contextual environment to generate an audit intent node containing semantic vector features, where the features of each node comprise rule text features, scientific data features, and project context information features; The vector cosine similarity between each pair of audit intent nodes is then calculated to quantify the semantic association strength between the nodes, and is combined with data dependencies to construct a comprehensive association matrix; Based on the comprehensive association matrix, a clustering algorithm groups the audit intent nodes into several logical clusters, where each logical cluster corresponds to a subtask with an independent audit objective and is recorded as a logical unit; A dependency graph is built between the subtasks, the edge set of which expresses constraints on the order of execution, so that the dependency graph is a directed graph; The dependency graph serves as the sub-task flow output of the review plan, and each subtask is associated with its corresponding review strategy identifier, input data range, and expected output type, thereby forming the audit task plan.

4. The scientific data quality auditing method based on multi-agent technology as described in claim 1, characterized in that: the step of generating an audit algorithm adapted to each subtask, by matching or dynamically combining entries from the knowledge base according to the type and data characteristics of each subtask in the subtask sequence, specifically comprises: reading the audit task plan and the metadata of each node within it, and extracting from each subtask node the associated review strategy identifier, input data range, and expected output type; encapsulating the review strategy identifier, input data range, and expected output type of each subtask as algorithm request parameters, and retrieving with them through an audit algorithm retrieval mechanism; the audit algorithm retrieval mechanism comprises: the audit algorithm generation agent queries the built-in audit algorithm library, in which every algorithm is marked with its set of applicable strategy identifiers, its set of supported data types, and its output format; an algorithm that satisfies the matching condition against the request parameters is selected as the matching result; if no candidate algorithm meets the above condition, the trained large model of the audit algorithm generation agent generates, based on the review strategy identifier, input data range, and expected output type of the current subtask, the audit algorithm implementation logic or an executable algorithm program for executing the review strategy; when multiple candidate algorithms meet the condition, an overall assessment score is calculated for each candidate algorithm, the terms of the score expression including the algorithm's average execution time and the algorithm's resource consumption, combined through normalized weighting coefficients; the algorithm with the highest score is selected as the final matching result; if the scores are the same, the algorithm with the higher historical accuracy is selected first.
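
The retrieval and tie-breaking logic of this claim can be sketched as follows. This is an illustrative sketch only: the `AuditAlgorithm` record, its field names, and `select_algorithm` are hypothetical, and the overall assessment score is taken as a precomputed input because the claim's score expression is not reproduced in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AuditAlgorithm:
    """Hypothetical library entry; field names are assumptions."""
    name: str
    strategy_ids: frozenset      # applicable strategy identifiers
    data_types: frozenset        # supported data types
    output_format: str
    score: float                 # overall assessment score, assumed precomputed
    historical_accuracy: float   # used only to break ties on score


def select_algorithm(
    library: Sequence[AuditAlgorithm],
    strategy_id: str,
    data_type: str,
    output_type: str,
) -> Optional[AuditAlgorithm]:
    """Return the best matching algorithm, or None if nothing matches.

    A None result is the point at which the claim falls back to generating
    the audit logic with a trained large model.
    """
    candidates = [
        a for a in library
        if strategy_id in a.strategy_ids
        and data_type in a.data_types
        and a.output_format == output_type
    ]
    if not candidates:
        return None
    # Highest score wins; equal scores are resolved by historical accuracy.
    return max(candidates, key=lambda a: (a.score, a.historical_accuracy))
```

Comparing `(score, historical_accuracy)` tuples keeps the tie-break rule in a single expression rather than a second pass over the candidates.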

5. The scientific data quality auditing method based on multi-agent technology as described in claim 1, characterized in that: the step of executing the audit algorithm in a restricted execution environment to obtain the audit results corresponding to each subtask comprises: creating, for the current subtask, a restricted execution environment or isolated computing unit that restricts the subtask's file system, network, and system call permissions, and loading the matched audit algorithm and its dependencies into the restricted execution environment; injecting the input data fragment corresponding to the subtask into the restricted execution environment as the algorithm input; starting the audit algorithm to execute the quality verification logic of the subtask, whereby the associated scientific data undergoes at least one data quality audit operation that checks one or more of the accuracy, completeness, consistency, reasonableness, compliance, or usability of the data, and outputting the original verification results, including the problem location, the problem type, and the confidence score; encapsulating the original results of each subtask into a standardized format, and aggregating the original results of all subtasks into a single raw result data set.

6. The scientific data quality auditing method based on multi-agent technology as described in claim 1, characterized in that: the step of normalizing the audit results and fusing them based on multi-dimensional weights to output a structured material data quality evaluation report comprises: normalizing and mapping the problem location identifiers in the raw result data set of the subtasks, to eliminate coordinate ambiguity caused by different subtasks; for conflicts among multiple problem results at the same data location, calculating an overall confidence level by a weighted voting mechanism, whose expression is taken over the set of all subtask indexes pointing to a given data location and combines the authority weight of each such subtask with its confidence score; mapping each problem type to at least one quality dimension according to a preset quality dimension mapping table; calculating a quality score for each dimension, whose formula is taken over the set of problem locations belonging to that dimension and the subtask indexes of each data location; introducing dimension importance coefficients to calculate the overall quality index; and integrating the scores for each dimension, the overall quality index, and the detailed list of issues into a structured and quantitative evaluation report.
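
The weighted voting and the overall index can be illustrated with a small Python sketch. The claim's exact expressions are not reproduced in the text, so both formulas below are assumptions: the fused confidence is taken as a weight-normalized average of subtask confidences, and the overall quality index as an importance-weighted average of dimension scores. The function and parameter names are hypothetical.

```python
def fused_confidence(votes):
    """Fuse conflicting results reported for one data location.

    `votes` is a list of (authority_weight, confidence) pairs, one per
    subtask that flagged the location. Assumed form: weighted average.
    """
    total_weight = sum(weight for weight, _ in votes)
    if total_weight == 0:
        return 0.0  # assumption: no weighted votes means no confidence
    return sum(weight * conf for weight, conf in votes) / total_weight


def overall_quality_index(dimension_scores, importance):
    """Combine per-dimension quality scores into a single index.

    `dimension_scores` maps a dimension name to its quality score and
    `importance` maps the same names to dimension importance coefficients.
    Assumed form: importance-weighted average.
    """
    total = sum(importance[dim] for dim in dimension_scores)
    if total == 0:
        return 0.0
    weighted = sum(importance[dim] * score
                   for dim, score in dimension_scores.items())
    return weighted / total
```

For example, two subtasks with equal authority weight that report confidences 0.5 and 1.0 for the same location fuse to 0.75 under this assumed form.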

7. The scientific data quality auditing method based on multi-agent systems as described in claim 1, characterized in that: the step of collecting in real time the full-link traceability data generated by interactions between the agents, identifying audit logic patterns that meet preset conditions based on the traceability data, and automatically storing verified audit rules in the knowledge base for subsequent material data quality audits specifically comprises: during the task scheduling phase, the traceability agent collects and records the audit task plan, its generation timestamp, the input data hash value, and the task scheduling agent identifier; during the task execution phase, the traceability agent records, for each subtask, the scheduling time, the algorithm matching result, the sandbox startup time, the original results, and the task execution agent identifier; the traceability agent organizes the fusion processing parameters, the dimension mapping relationship, the coefficients, and the report generation time chronologically as an event chain; with data version snapshots and the task dependency graph as its core, the event chain is built into an immutable full-link traceability archive, which supports reverse backtracking queries by time, task node, or data object; after the audit task is completed, salient audit logic patterns are solidified based on the full-link traceability archive and historical audit results, wherein confirmation is given automatically according to a preset strategy or is received from users through a human-computer interaction interface, and the confirmed rules or algorithms are solidified into the audit knowledge base, including: within a predefined historical audit window, the algorithm extension agent counts the trigger frequency of each audit logic pattern in historical audit tasks and calculates the trigger frequency of the corresponding pattern in the current audit task; the difference in trigger frequency for the corresponding pattern is calculated, and when the difference is greater than a preset threshold, the corresponding rule or algorithm is determined to be a salient rule or algorithm to be solidified; for the identified salient rules or algorithms, corresponding candidate rules or algorithms are generated, and a comprehensive score of coverage, contribution, and stability is calculated for each candidate rule or algorithm using weighting coefficients, where coverage represents the candidate rule's sample coverage of the current task data, namely the proportion of the total data volume hit by the rule, contribution represents the candidate rule's contribution to the improvement of data quality, namely the gain in the quality index before and after applying the rule, and stability represents the consistency of the candidate rule's performance within the historical window; the knowledge base is a resource library for storing scientific data quality audit logic, which contains a set of audit rules, algorithm operators, or logical models; the audit logic is implemented through one or more combinations of analysis logic, calculation logic, and judgment logic, and can be invoked by the task execution stage or by subsequent audit tasks; when the comprehensive score of a candidate rule or algorithm is not less than a preset threshold, the candidate audit rule or algorithm and its associated traceability evidence are pushed to the user terminal through the human-computer interaction interface.
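
The solidification decision in this claim can be sketched in Python as below. The claim's expressions are not reproduced in the text, so the forms used here are assumptions: the frequency difference is taken as an absolute difference, and the comprehensive score as a weighted sum of coverage, contribution, and stability. All function names, parameter names, and default weights are hypothetical.

```python
def is_salient(historical_freq, current_freq, threshold):
    """Flag a pattern whose trigger frequency shifted by more than `threshold`.

    Assumed form: absolute difference between the current-task frequency
    and the frequency observed in the historical window.
    """
    return abs(current_freq - historical_freq) > threshold


def comprehensive_score(coverage, contribution, stability,
                        weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three candidate-rule indicators (assumed form).

    coverage:     share of the current task data hit by the rule
    contribution: gain in the quality index before and after applying the rule
    stability:    consistency of the rule within the historical window
    """
    w_cov, w_con, w_sta = weights
    return w_cov * coverage + w_con * contribution + w_sta * stability


def should_push_for_confirmation(score, score_threshold):
    """A candidate is pushed to the user when its score is not below the threshold."""
    return score >= score_threshold
```

Keeping the salience test separate from the scoring step mirrors the two-stage filter in the claim: a pattern must first stand out, and only then is it scored and offered for confirmation.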

8. A multi-agent-based scientific data quality auditing system, based on the multi-agent-based scientific data quality auditing method according to any one of claims 1 to 7, characterized in that it comprises: a task scheduling module, a task execution module, an audit algorithm generation module, a quantitative evaluation module, a full-link traceability module, and a knowledge base extension module; the task scheduling module is configured with a task scheduling agent, which is used to perform structural and semantic analysis on scientific data and to generate audit task plans with dependencies through feature extraction and logical clustering; the audit algorithm generation module has a built-in audit algorithm generation agent, which is used to retrieve the appropriate algorithm from the knowledge base based on task data, or to call the generative agent to dynamically construct the execution logic; the task execution module is configured with a task execution agent and a secure execution environment, and is used to run the audit algorithm in the isolated secure execution environment and to output standardized results including location, type, and confidence level; the quantitative evaluation module is configured with a quantitative evaluation agent, which is used to normalize and map multi-source results and fuse them with weights to generate a multi-dimensional structured evaluation report; the full-link traceability module is configured with a traceability agent, which is used to record the operation logs and data flow information of each agent unit and to construct a full-link traceability archive; the knowledge base extension module is configured with an algorithm extension agent, which automatically solidifies audit rules based on historical pattern matching and comprehensive scoring; the supporting architecture of the multi-agent system further includes a unified message bus and a status monitoring unit, which are used for data sharing among the agents and for end-to-end adaptive task rescheduling.
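
As an illustration of how a unified message bus could let the agents share data, the following is a minimal in-process publish/subscribe sketch. It is not the claimed architecture: the `MessageBus` class, its methods, and the topic names are hypothetical, and a real deployment would likely use a networked broker.

```python
from collections import defaultdict


class MessageBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers
        self.log = []                          # every published message, in order

    def subscribe(self, topic, handler):
        """Register `handler` to be called with each payload sent to `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Record the message, then deliver it to the topic's subscribers."""
        self.log.append((topic, payload))
        for handler in self._subscribers.get(topic, []):
            handler(payload)


# Hypothetical usage: an evaluation agent listens for execution results.
bus = MessageBus()
received = []
bus.subscribe("audit.result", received.append)
bus.publish("audit.result", {"subtask": "t1", "confidence": 0.9})
```

The ordered `log` is the kind of record a traceability agent or status monitoring unit could read without being a subscriber to every topic.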

9. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that: When the processor executes the computer program, it implements the steps of the multi-agent-based scientific data quality auditing method according to any one of claims 1 to 7.

10. A computer-readable storage medium having a computer program stored thereon, characterized in that: When the computer program is executed by the processor, it implements the steps of the multi-agent-based scientific data quality auditing method according to any one of claims 1 to 7.