A man-machine collaborative analysis method and platform for intelligent question answering in an industrial scene

By constructing a three-dimensional semantic association index graph and a term entity resolution mechanism in industrial scenarios, the problems of low cross-database query efficiency and semantic parsing bias of multi-source heterogeneous data are solved, enabling accurate cross-database joint queries and result reliability assessment, and forming a continuously optimized query data platform.

CN122240799A · Pending · Publication Date: 2026-06-19 · ZHEJIANG CHINAJEY SOFTWARE TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: ZHEJIANG CHINAJEY SOFTWARE TECH CO LTD
Filing Date: 2026-05-22
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing intelligent query methods struggle to achieve cross-database semantic-level association mapping of multi-source heterogeneous data in industrial scenarios. They lack systematic modeling of industrial terminology, resulting in large deviations in query intent recognition, low accuracy in query statement generation, and a lack of human-machine collaborative verification mechanisms and continuous optimization capabilities for terminology models.

Method used

By constructing a three-dimensional semantic association index graph with production batch identifiers as the association anchors, cross-database association mapping is performed, and term entity resolution is performed on user natural language query commands to generate clarification information. The term matching weights are updated based on user confirmation and correction feedback to achieve cross-database joint query with accurate query intent.

Benefits of technology

It improves the efficiency of cross-database queries for multi-source heterogeneous data in industrial scenarios, enhances the accuracy of semantic understanding and the reliability of query results, and forms a complete human-machine collaborative closed loop of data inquiry, clarification, query, verification and feedback, continuously optimizing the performance of the intelligent data inquiry platform.


Figure CN122240799A_ABST

Abstract

This invention relates to the field of data management technology and discloses a human-machine collaborative analysis method and platform for intelligent data querying in industrial scenarios. The method includes: generating a three-dimensional semantic association index map using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors; performing terminology entity resolution on natural language data query commands from users in industrial production to generate clarification information; generating precise data query intent based on user confirmation and correction feedback on the clarification information, and converting the precise data query intent into an executable cross-database joint query statement based on the three-dimensional semantic association index map; performing data retrieval and confidence assessment on the cross-database joint query statement, and pushing the retrieval results and confidence assessment results to the user for verification; updating the terminology matching weights of the professional terminology ontology based on the verification results and user correction feedback. This invention improves the query accuracy and result reliability of the industrial intelligent data querying platform.

Description

Technical Field

[0001] This invention relates to the field of data management technology, and in particular to a human-machine collaborative analysis method and platform for intelligent data querying in industrial scenarios.

Background Technology

[0002] With the development of database management systems and natural language processing (NLP) technology, intelligent querying methods based on natural language for database queries are increasingly being applied in the field of data analysis. Users can directly obtain data query results through natural language commands, lowering the technical threshold for data acquisition. However, existing intelligent querying methods lack effective modeling tools for the relationships between different data sources when facing multi-source heterogeneous data query scenarios. This makes it difficult to achieve semantic-level association mapping across databases, requiring users to initiate queries on multiple data sources separately and manually integrate the results, resulting in low query efficiency. Furthermore, existing intelligent querying methods lack mechanisms for identifying and resolving semantic ambiguities in domain-specific terminology during the semantic parsing of natural language commands. This makes it difficult to accurately identify synonymous ambiguities, vague parameter references, and unclear query scopes in the user's query intent, leading to deviations between the generated query and the user's true intent and insufficient accuracy of the query results. In addition, existing methods only provide raw data when returning query results, lacking quantitative evaluation methods for the reliability of the results. Users find it difficult to judge the reliability of the results, and the platform lacks the ability to continuously optimize based on user feedback, failing to iteratively improve the accuracy of terminology parsing and intent recognition based on corrections and verification data from historical queries.

[0003] In industrial production settings, intelligent query platforms need to interface with diverse, heterogeneous data sources, including DCS (Distributed Control System), MES (Manufacturing Execution System), and LIMS (Laboratory Information Management System). These sources hold real-time process parameters, production batch operating conditions, and quality inspection data, so the data structures are complex and strongly domain-specific. Existing intelligent query methods for industrial scenarios lack systematic modeling and semantic resolution techniques for industrial terminology. This makes it difficult to accurately translate users' colloquial and ambiguous natural language query commands into cross-database joint query statements involving multiple data sources, leading to significant errors in query intent recognition and low accuracy in query statement generation. Furthermore, during human-machine collaborative analysis, the lack of a multi-turn clarification dialogue based on user confirmation and correction makes it difficult to eliminate semantic ambiguity before query execution. The lack of a closed loop for manual verification and feedback collection after results are returned leaves query results prone to errors and prevents user feedback from driving continuous optimization of the terminology parsing model. Consequently, the platform struggles to adapt to the evolving terminology systems and query needs of industrial scenarios.

Therefore, there is an urgent need for an intelligent data query and human-machine collaborative analysis method for industrial scenarios that solves the problems of low semantic parsing accuracy, the lack of a human-machine collaborative verification mechanism, and the inability to continuously optimize terminology models in cross-database queries of multi-source heterogeneous data in industrial production, and that improves the semantic understanding accuracy, query result reliability, and adaptive model iteration capability of industrial intelligent data query platforms in complex production environments.

Summary of the Invention

[0004] This invention provides a human-machine collaborative analysis method and platform for intelligent data querying in industrial scenarios to solve the problems mentioned in the background art.

[0005] To achieve the above objectives, this invention provides a human-machine collaborative analysis method for intelligent data querying in industrial scenarios, comprising:
A1: Using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors, establish a cross-database association mapping and generate a three-dimensional semantic association index map;
A2: Perform term entity resolution on the natural language query commands of users in industrial production to generate clarification information for those commands;
A3: Based on the user's confirmation and correction feedback on the clarification information, generate a precise query intent, and, based on the three-dimensional semantic association index map, convert the precise query intent into an executable cross-database joint query statement;
A4: Perform data retrieval and confidence assessment for the executable cross-database joint query statement, and push the retrieval results and confidence assessment results to the user for verification;
A5: Based on the verification results and user correction feedback, update the terminology matching weights of the professional terminology ontology for industrial production.

[0006] In a preferred embodiment, the step of establishing a cross-database association mapping and generating a three-dimensional semantic association index map by using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors includes: Semantic tags are used to annotate real-time process parameter data, production batch operating condition data, and quality inspection data of multi-source heterogeneous data in industrial production. Using the production batch identifier commonly contained in the semantic tags as the association anchor, a cross-database association mapping is established between each parameter field in the real-time process parameter data and each detection index field in the quality inspection data. A three-dimensional semantic association index map is generated using the production batch identifier, the parameter fields, and the detection index fields as association dimensions.

[0007] In a preferred embodiment, the step of performing term entity resolution on natural language query commands from users in industrial production to generate clarification information for those commands includes: Semantic similarity analysis is performed between the query entities, query dimensions, and query constraints in the natural language query command and the professional terminology ontology for industrial production, to identify synonym ambiguities in the command. The attribution of each query entity to the association dimensions of the three-dimensional semantic association index graph is determined, to identify parameter reference ambiguities of the query entity. The query constraints are matched against the three-dimensional semantic association index graph, to identify points where the query scope of the constraints is unclear. The synonym ambiguities, parameter reference ambiguities, and query scope ambiguities are integrated into clarification information that includes a preliminary query intent summary and a list of ambiguities to be confirmed. The clarification information is pushed to the user's interactive interface, which waits for the user to confirm or correct it.
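The clarification information described above (an intent summary plus a list of ambiguities to confirm) could be represented roughly as follows. This is a minimal sketch for illustration only: the class names `Ambiguity` and `Clarification`, the helper `build_clarification`, and the three `kind` labels are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class Ambiguity:
    kind: str      # assumed labels: "synonym", "parameter_reference", "query_scope"
    phrase: str    # the phrase from the user's command that is ambiguous
    options: list  # candidate interpretations offered to the user


@dataclass
class Clarification:
    intent_summary: str
    ambiguities: list = field(default_factory=list)

    def unresolved(self, choices):
        """Return ambiguities the user has not yet confirmed or corrected.

        `choices` maps an ambiguous phrase to the interpretation the user picked.
        """
        return [a for a in self.ambiguities if a.phrase not in choices]


def build_clarification(summary, synonym, reference, scope):
    """Merge the three kinds of ambiguity into one clarification message.

    Each argument after `summary` maps an ambiguous phrase to its candidate options.
    """
    items = (
        [Ambiguity("synonym", p, o) for p, o in synonym.items()]
        + [Ambiguity("parameter_reference", p, o) for p, o in reference.items()]
        + [Ambiguity("query_scope", p, o) for p, o in scope.items()]
    )
    return Clarification(summary, items)
```

A query would only proceed to statement generation once `unresolved(...)` returns an empty list, which mirrors the requirement that ambiguity be eliminated before execution.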

[0008] In a preferred embodiment, the step of generating a precise question intent based on the user's confirmation and correction feedback on the clarification information, and converting the precise question intent into an executable cross-database join query statement based on the three-dimensional semantic association index graph, includes: The system receives user confirmation and correction operations for each ambiguous point in the clarification information, and integrates the unambiguous query entity, query dimension and query constraint to generate a precise query intent. In the three-dimensional semantic association index map, the association path is retrieved for the precise question intent to determine the target data source, target association field and target association path involved in the precise question intent; Based on the target association path, the precise query intent is decomposed into query subtasks for each target data source; Based on the query syntax characteristics of the target data source, the query subtasks are converted into corresponding query statements; Based on the association rules of the target association path, the query statements of each target data source are integrated into an executable cross-database joint query statement.
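The decomposition and integration steps above can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the source and table names, the `batch_id` join column, and the use of a single SQL dialect for every source are all hypothetical. A real system would render each subtask in the target source's own query syntax and would bind parameters rather than interpolate strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubTask:
    source: str    # hypothetical data source name, e.g. "dcs" or "lims"
    table: str     # hypothetical table exposed by that source
    fields: tuple  # fields requested from that source


def build_subquery(task, batch_id):
    """Render the query for one target data source.

    String interpolation is used only to keep the sketch short; production
    code should use bound parameters to avoid SQL injection.
    """
    cols = ", ".join(("batch_id",) + task.fields)
    return f"SELECT {cols} FROM {task.table} WHERE batch_id = '{batch_id}'"


def build_joint_query(tasks, batch_id):
    """Join the per-source subqueries on the production batch identifier."""
    parts = []
    for i, task in enumerate(tasks):
        sub = f"({build_subquery(task, batch_id)}) AS s{i}"
        parts.append(sub if i == 0 else f"JOIN {sub} ON s0.batch_id = s{i}.batch_id")
    return "SELECT * FROM " + " ".join(parts)
```

Joining on the batch identifier reflects its role as the association anchor; the association rules of a real target path could add further join conditions, such as time alignment.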

[0009] In a preferred embodiment, the step of performing data retrieval and confidence assessment on the executable cross-database join query statement, and pushing the retrieval results and confidence assessment results to the user for verification, includes: Based on the cross-database join query statement, data retrieval is performed on the data sources involved in the precise query intent to obtain the retrieval results; The confidence level of the search results is evaluated based on the data integrity dimension, semantic relevance dimension, and timeliness dimension. The search results and the confidence assessment results are pushed to the interactive interface for user verification.

[0010] In a preferred embodiment, the confidence assessment result is calculated using a formula [formula not reproduced] whose quantities are, in order: the confidence assessment result of the search results; the weighting coefficient of the data integrity dimension; the number of fields that actually returned valid data in the data integrity dimension; the total number of fields involved in the precise query intent; the geometric weighting exponent of the data integrity dimension; the weighting coefficient of the semantic relevance dimension; the semantic matching ratio of the semantic relevance dimension; the geometric weighting exponent of the semantic relevance dimension; the weighting coefficient of the timeliness dimension; the time interval between the latest record time of the data in the search results and the current query time; and the geometric weighting exponent of the timeliness dimension.
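Because the formula itself is not reproduced here, the following is only one plausible reading of the listed quantities, offered as a sketch. It assumes a weighted sum of three dimension scores, each raised to its own exponent, and it assumes the time interval is mapped to a score by exponential decay with a hypothetical half-life. The function name, default weights, and the decay form are all assumptions and may differ from the patented formula.

```python
def confidence(n_valid, n_total, semantic_match, age_hours,
               weights=(0.4, 0.4, 0.2), exponents=(1.0, 1.0, 1.0),
               half_life_hours=24.0):
    """Combine data integrity, semantic relevance, and timeliness into one score.

    ASSUMED form: sum of weight * score ** exponent over the three dimensions.
    """
    # Data integrity: share of requested fields that returned valid data.
    integrity = (n_valid / n_total) if n_total else 0.0
    # Timeliness: ASSUMED exponential decay of the record age (not from the source).
    timeliness = 0.5 ** (age_hours / half_life_hours)
    scores = (integrity, semantic_match, timeliness)
    return sum(w * (s ** e) for w, s, e in zip(weights, scores, exponents))
```

With weights that sum to 1 and all scores in [0, 1], the result also stays in [0, 1], which makes it easy to present alongside the retrieval results.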

[0011] In a preferred embodiment, pushing the search results and the confidence assessment results to the interactive interface for user verification includes: The interactive interface displays the search results, the confidence assessment results, and the visualization information of the association path. The search results are pushed to the interactive interface, where users can manually verify, correct, and annotate the search results, generating feedback for correction.

[0012] In a preferred embodiment, updating the terminology matching weights of the industrial production terminology ontology based on verification results and user feedback includes: The user's confirmation and correction operations for each semantic ambiguity point in the clarification information are used to construct a clarification feedback vector; the user's manual verification, correction, and anomaly labeling of the search results are used to construct a verification feedback vector; based on the clarification feedback vector and the verification feedback vector, the adjustment amount of the matching weight of each term in the terminology ontology is calculated; based on the adjustment amount, the matching weight of each term in the terminology ontology is optimized and updated. The term matching weight is calculated using a formula [formula not reproduced] whose quantities are, in order: the index number of each term in the terminology ontology; the term with that index number; the updated matching weight of that term; the matching weight of that term before the update; the correction score of that term in the clarification feedback vector; the verification feedback score of that term in the verification feedback vector; the influence coefficient of the clarification feedback vector on the term matching weight update; and the influence coefficient of the verification feedback vector on the term matching weight update.

[0013] In a preferred embodiment, calculating the adjustment amount of each term matching weight in the terminology ontology based on the clarification feedback vector and the verification feedback vector includes: A correction consistency analysis is performed on the user-corrected terms in the clarification feedback vector to obtain the correction score of each term; a data correction magnitude analysis is performed on the user-verified retrieval results in the verification feedback vector to obtain the verification feedback score of each term; the adjustment amount is obtained by weighting and summing the correction score and the verification feedback score.
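The weighted-sum adjustment described above can be sketched as follows. The sketch assumes the updated weight is the previous weight plus the adjustment amount, and it clamps the result to [0, 1]; both the additive form and the clamping are assumptions, as are the function name and the default coefficient values.

```python
def update_weights(weights, correction, verification, alpha=0.1, beta=0.1):
    """Return new term matching weights after one round of user feedback.

    weights:      {term: current matching weight}
    correction:   {term: correction score from the clarification feedback vector}
    verification: {term: verification feedback score from the verification feedback vector}
    alpha, beta:  influence coefficients of the two feedback vectors (values assumed)
    """
    updated = {}
    for term, w in weights.items():
        # Adjustment amount: weighted sum of the two feedback scores.
        delta = alpha * correction.get(term, 0.0) + beta * verification.get(term, 0.0)
        # ASSUMED additive update, clamped so weights stay in [0, 1].
        updated[term] = min(1.0, max(0.0, w + delta))
    return updated
```

For example, if a user repeatedly resolves "temperature" to "reaction temperature", positive scores for that term raise its weight while a negative correction score lowers the weight of "discharge temperature".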

[0014] To address the aforementioned problems, this invention also provides a human-machine collaborative analysis platform for intelligent data querying in industrial scenarios, the platform comprising: The index graph construction module is used to establish cross-database association mapping and generate a three-dimensional semantic association index graph by using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors. The terminology resolution and clarification module is used to resolve the terminology entities of natural language question commands from users in industrial production and generate clarification information for the natural language question commands. The query statement conversion module is used to generate the precise question intent of the clarification information based on the user's confirmation and correction feedback, and to convert the precise question intent into an executable cross-database joint query statement based on the three-dimensional semantic association index graph. The retrieval verification push module is used to perform data retrieval and confidence assessment on executable cross-database joint query statements, and push the retrieval results and confidence assessment results to users for verification. The weight feedback update module is used to update the term matching weights of the professional terminology ontology in industrial production based on the verification results and user correction feedback.

[0015] Compared with the prior art, the present invention has the following beneficial effects: 1. This invention constructs a three-dimensional semantic association index map with production batch identifiers as the association anchors, and performs cross-database association mapping of real-time process parameter data from the DCS system, production batch operating condition data from the MES system, and quality inspection data from the LIMS system. This solves the problem of low cross-database query efficiency caused by multi-source heterogeneous data silos in industrial scenarios. Through a terminology entity resolution mechanism based on a professional terminology ontology for the industrial domain, it identifies semantic ambiguities such as synonym ambiguity, vague parameter references, and unclear query scope, and generates clarification information for user confirmation, solving the problem of large intent recognition deviations when existing intelligent query methods parse industrial terminology. Furthermore, by retrieving target association paths in the three-dimensional semantic association index map, it decomposes the precise query intent into query subtasks for each target data source and integrates them into a cross-database joint query statement, solving the problem of low query efficiency caused by manually integrating results after querying multiple data sources separately. It quantitatively evaluates the search results with a confidence assessment formula based on three dimensions (data integrity, semantic relevance, and timeliness), solving the problem that query results lack quantifiable support for their credibility. Finally, it incrementally optimizes and updates the terminology matching weights in the professional terminology ontology by collecting user correction feedback on clarification information and verification feedback on search results, solving the problem that existing methods cannot continuously optimize the model based on user feedback.

[0016] 2. Regarding query efficiency, this invention enables multi-data-source association retrieval with a single query command through the three-dimensional semantic association index graph, so users no longer need to query multiple data sources separately and integrate the results manually, which significantly shortens the response time from query to associated results. Regarding semantic parsing accuracy, terminology entity resolution and multi-round clarification interaction eliminate semantic ambiguity before query execution, transforming users' colloquial and ambiguous natural language commands into precise cross-database joint query statements and effectively improving both intent recognition accuracy for industrial terminology and the accuracy of query statement generation. Regarding result reliability, the confidence assessment formula provides a three-dimensional quantitative evaluation of the search results in terms of data integrity, semantic relevance, and timeliness, giving users a quantifiable basis for judging the reliability of the results and enhancing the interpretability of, and user trust in, the query results. Regarding model adaptability, user correction and verification operations serve as feedback data that drives the continuous optimization and updating of term matching weights, so the platform can keep adapting to the evolving professional terminology system and query needs of industrial scenarios. Together these form a complete human-machine collaborative closed loop of data inquiry, clarification, query, verification, and feedback optimization, continuously improving the overall performance of the intelligent data inquiry platform in complex industrial production environments.

Attached Figure Description

[0017] Figure 1 is a flowchart of a human-machine collaborative analysis method for intelligent data querying in an industrial scenario, provided by an embodiment of the present invention.
Figure 2 is a functional module diagram of a human-machine collaborative analysis platform for intelligent data querying in an industrial scenario, provided by an embodiment of the present invention.
The realization of the objectives, functional features, and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0018] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0019] This application provides a human-machine collaborative analysis method for intelligent data querying in industrial scenarios. The executing entity of this method includes, but is not limited to, at least one of the following electronic devices that can be configured to execute the method provided in this application: a server, a terminal, etc. In other words, the human-machine collaborative analysis method for intelligent data querying in industrial scenarios can be executed by software or hardware installed on a terminal device or a server device. The server includes, but is not limited to, a single server, a server cluster, a cloud server, or a cloud server cluster. The server can be an independent server or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms.

[0020] Refer to Figure 1, which is a flowchart of a human-machine collaborative analysis method for intelligent data querying in an industrial scenario according to an embodiment of the present invention. In this embodiment, the method includes:
A1: Using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors, establish a cross-database association mapping and generate a three-dimensional semantic association index map;
In this embodiment of the invention, step A1 includes: Semantic tags are used to annotate the real-time process parameter data, production batch operating condition data, and quality inspection data of the multi-source heterogeneous data in industrial production. Using the production batch identifier common to the semantic tags as the association anchor, a cross-database association mapping is established between each parameter field in the real-time process parameter data and each detection index field in the quality inspection data. A three-dimensional semantic association index map is generated using the production batch identifier, the parameter fields, and the detection index fields as association dimensions.

[0021] The system integrates real-time process parameter data collected by the DCS system in industrial production. This real-time process parameter data includes process control parameters such as reaction temperature, reaction pressure, flow rate, and liquid level. Semantic tags are applied to each parameter field in the real-time process parameter data according to an industrial terminology ontology. These semantic tags include a production batch identifier, a process stage identifier, and a parameter category identifier. The production batch identifier identifies the production batch number corresponding to the real-time process parameter data; the process stage identifier identifies the industrial process stage to which the real-time process parameter data belongs; and the parameter category identifier identifies the parameter type of the real-time process parameter data.

[0022] The system accesses production batch condition data recorded by the MES system in industrial production. This production batch condition data includes production management information such as the start time, end time, material input, and output of the production batch. Semantic tags are applied to each condition record field in the production batch condition data according to the industrial domain terminology ontology. The semantic tags include the production batch identifier and the process step identifier to which the data belongs. The production batch identifier is used to identify the production batch number corresponding to the production batch condition data, and the process step identifier is used to identify the industrial process step to which the production batch condition data belongs.

[0023] The quality inspection data stored in the LIMS system in industrial production is accessed. The quality inspection data includes quality inspection indicators such as product purity, impurity content, color, and moisture. The semantic tags of each inspection indicator field in the quality inspection data are annotated according to the professional terminology ontology library of the industrial field. The semantic tags include the production batch identifier to which the data belongs and the quality indicator identifier. The production batch identifier is used to identify the production batch number corresponding to the quality inspection data, and the quality indicator identifier is used to identify the inspection indicator type of the quality inspection data.

[0024] Using the production batch identifier commonly contained in the semantic tags as the association anchor, and according to the time alignment relationship under the same production batch, a cross-database association mapping is established between each parameter field in the real-time process parameter data and each detection index field in the quality inspection data. The time alignment relationship refers to matching the timestamp of the real-time process parameter data collected by the DCS system under the same production batch with the detection time of the quality inspection data recorded by the LIMS system, and associating data records with similar times. The cross-database association mapping records the association relationship between the real-time process parameter data and the quality inspection data, as well as the corresponding rules of the associated fields.
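The time alignment described above can be sketched as a nearest-timestamp match within each production batch. This is a minimal illustration under assumptions: records are plain tuples, timestamps are numeric seconds, and the 300-second tolerance that defines "similar times" is a made-up value. The function name `align_by_time` is hypothetical.

```python
from bisect import bisect_left


def align_by_time(dcs_rows, lims_rows, tolerance_s=300):
    """Pair each LIMS record with the nearest-in-time DCS record of the same batch.

    dcs_rows:  iterable of (batch_id, timestamp_s, parameter_value)
    lims_rows: iterable of (batch_id, test_timestamp_s, test_result)
    Returns a list of (batch_id, dcs_ts, parameter_value, lims_ts, test_result).
    """
    # Group DCS samples by batch, sorted by timestamp.
    by_batch = {}
    for batch_id, ts, value in sorted(dcs_rows, key=lambda r: (r[0], r[1])):
        by_batch.setdefault(batch_id, []).append((ts, value))

    pairs = []
    for batch_id, test_ts, result in lims_rows:
        series = by_batch.get(batch_id, [])
        if not series:
            continue  # no DCS data for this batch, so nothing to associate
        times = [t for t, _ in series]
        i = bisect_left(times, test_ts)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = series[max(0, i - 1): i + 1]
        ts, value = min(candidates, key=lambda c: abs(c[0] - test_ts))
        if abs(ts - test_ts) <= tolerance_s:
            pairs.append((batch_id, ts, value, test_ts, result))
    return pairs
```

Records that share a batch but fall outside the tolerance are left unpaired rather than forced together, which keeps the association mapping from linking unrelated measurements.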

[0025] A three-dimensional semantic association index map is generated using the production batch identifier, the parameter fields, and the detection index fields as three association dimensions. The three-dimensional semantic association index map uses the production batch identifier as the first association dimension, the parameter fields in the real-time process parameter data as the second association dimension, and the detection index fields in the quality inspection data as the third association dimension. It records the association paths and association weights between the three association dimensions. The association weights are assigned based on the degree of influence of the parameter fields on the detection index fields in the same production batch. The degree of influence is determined by analyzing the correspondence between the changing trends of the parameter fields and the detection results of the detection index fields in the same production batch.
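One way to picture the index map is as a dictionary keyed by the three association dimensions, with the association weight as the value. The sketch below makes a notable assumption: it uses the absolute Pearson correlation between aligned parameter values and detection results as a stand-in for the "degree of influence", which the text does not specify. The function names and the input layout are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def build_index(aligned):
    """Build {(batch_id, parameter_field, index_field): association_weight}.

    aligned maps the same 3-tuple key to a list of time-aligned
    (parameter_value, detection_value) samples.
    """
    index = {}
    for key, samples in aligned.items():
        xs = [p for p, _ in samples]
        ys = [q for _, q in samples]
        # ASSUMED weight: strength of the linear trend correspondence.
        index[key] = abs(pearson(xs, ys)) if len(samples) >= 2 else 0.0
    return index


def paths_for(index, batch_id, index_field):
    """List parameter fields linked to one detection index in a batch, strongest first."""
    hits = [(w, p) for (b, p, q), w in index.items() if b == batch_id and q == index_field]
    return [p for _, p in sorted(hits, reverse=True)]
```

Keying on all three dimensions lets a query such as "which parameters relate to purity in batch B1" be answered by a simple filter over the keys.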

[0026] The beneficial effects are as follows: By constructing a three-dimensional semantic association index map with production batch identifiers as the association anchors, real-time process parameter data from the DCS system, production batch operating condition data from the MES system, and quality inspection data from the LIMS system are mapped across databases. This achieves semantic-level association and integration of multi-source heterogeneous data in industrial production, eliminates the constraints of data silos on cross-database queries, and enables users to complete multi-database association retrieval with a single query command. This significantly shortens the response time from querying data to obtaining association results and improves the cross-database query efficiency of the industrial intelligent query platform in complex production environments.

[0027] A2: Perform term entity resolution on the natural language query commands of users in industrial production to generate clarification information for those commands;
In this embodiment of the invention, step A2 includes: Semantic similarity analysis is performed between the query entities, query dimensions, and query constraints in the natural language query command and the professional terminology ontology for industrial production, to identify synonym ambiguities in the command. The attribution of each query entity to the association dimensions of the three-dimensional semantic association index graph is determined, to identify parameter reference ambiguities of the query entity. The query constraints are matched against the three-dimensional semantic association index graph, to identify points where the query scope of the constraints is unclear. The synonym ambiguities, parameter reference ambiguities, and query scope ambiguities are integrated into clarification information that includes a preliminary query intent summary and a list of ambiguities to be confirmed. The clarification information is pushed to the user's interactive interface, which waits for the user to confirm or correct it.

[0028] The natural language query command is segmented to extract the query entity, query dimension, and query constraints it contains. The query entity is the specific data object mentioned by the user in the natural language query command, namely the name of a specific parameter or indicator in industrial production, for example reaction temperature or product purity. The query dimension is the angle from which the user wants to analyze the query entity, such as the time dimension or the batch dimension. The query constraints are the limiting conditions set by the user for the query scope, such as a certain batch number or a certain time period.

[0029] The query entity is subjected to semantic similarity analysis against the professional terminology ontology database in industrial production. The professional terminology ontology database stores standard terms in the field of industrial production together with their synonyms, near-synonyms, and hyponyms. The semantic similarity analysis calculates, one by one, the semantic distance between the query entity and each term node in the professional terminology ontology database. When the semantic similarity values between the query entity and multiple term nodes all exceed a preset threshold and differ from one another only slightly, the query entity is determined to have a synonym ambiguity point. A synonym ambiguity point means that the query entity has multiple candidate terms in the professional terminology ontology database that are semantically similar but refer to different meanings. For example, the word "temperature" entered by the user may refer to both the reaction temperature and the discharge temperature.
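The synonym ambiguity test described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `TermMatch` structure, the threshold of 0.7, and the maximum gap of 0.05 are all assumed values, and the similarity scores are taken as given rather than computed from an ontology.

```python
from dataclasses import dataclass

@dataclass
class TermMatch:
    term: str
    similarity: float  # assumed to lie in [0, 1]

def find_synonym_ambiguity(matches, threshold=0.7, max_gap=0.05):
    """Return the competing candidate terms, or [] if the entity is unambiguous."""
    # Keep only term nodes whose similarity exceeds the preset threshold.
    above = sorted((m for m in matches if m.similarity > threshold),
                   key=lambda m: m.similarity, reverse=True)
    if len(above) < 2:
        return []
    # Candidates compete when their scores are close to the best score.
    best = above[0].similarity
    close = [m.term for m in above if best - m.similarity <= max_gap]
    return close if len(close) >= 2 else []

# Hypothetical scores for the query entity "temperature".
matches = [TermMatch("reaction temperature", 0.82),
           TermMatch("discharge temperature", 0.80),
           TermMatch("product purity", 0.31)]
candidates = find_synonym_ambiguity(matches)
```

With these assumed scores, both temperature terms are returned as candidates, which corresponds to the "temperature" example in the text.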

[0030] Dimension attribution determination is performed on the query entity against the association dimensions in the three-dimensional semantic association index graph. The dimension attribution determination refers to matching the query entity with the production batch identifier association dimension, the parameter field association dimension, and the detection indicator association dimension in the three-dimensional semantic association index graph. When the query entity cannot be uniquely attributed to a single association dimension, the query entity is determined to have a parameter reference ambiguity point. A parameter reference ambiguity point means that the query entity is associated with multiple association dimensions in the three-dimensional semantic association index graph at the same time, so the specific data source the user intends cannot be determined. For example, the word "content" entered by the user may refer to both the online detection content in the DCS system and the laboratory detection content in the LIMS system.
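A minimal sketch of this attribution check, assuming the index graph can be queried as a plain mapping from each association dimension to the entity names it contains. The `dimension_index` layout and the entity names are hypothetical.

```python
DIMENSIONS = ("production_batch_identifier", "parameter_field", "detection_indicator")

def attribute_dimension(entity, dimension_index):
    """Find every association dimension that contains the query entity."""
    hits = [d for d in DIMENSIONS if entity in dimension_index.get(d, set())]
    # More than one hit means the entity cannot be uniquely attributed.
    return {"entity": entity, "dimensions": hits, "ambiguous": len(hits) > 1}

# Hypothetical index content: "moisture content" exists both as an online
# parameter field (DCS) and as a laboratory detection indicator (LIMS).
dimension_index = {
    "parameter_field": {"moisture content", "reaction temperature"},
    "detection_indicator": {"moisture content", "product purity"},
}
result = attribute_dimension("moisture content", dimension_index)
```

An entity with no hit at all is reported with an empty dimension list; how the platform handles that case is not stated in the text.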

[0031] Association path matching is performed between the query constraints and the three-dimensional semantic association index graph. The association path matching refers to comparing the limiting information in the query constraints with the association paths recorded in the three-dimensional semantic association index graph. When the coverage of the query constraints is unclear or conflicts with an association path, the query constraints are determined to have an unclear query scope point. An unclear query scope point means that the query constraints fail to specify the time period, batch range, or data source of the query, so the platform cannot determine the execution scope of the query. For example, the user asks only for "recent temperature data" without specifying a time period or batch number.
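The scope check can be illustrated with a very small sketch. It assumes the parsed constraints arrive as a dictionary with the three keys below, and it treats each missing item as a separate gap; both the key names and that reading of "unclear" are assumptions, and the conflict check against association paths is omitted.

```python
REQUIRED_SCOPE_KEYS = ("time_period", "batch_range", "data_source")

def find_scope_gaps(constraints):
    """Return the scope items that the query constraints leave unspecified."""
    return [key for key in REQUIRED_SCOPE_KEYS if not constraints.get(key)]

# "recent temperature data": no concrete time period or batch number is given.
constraints = {"time_period": None, "batch_range": None, "data_source": "DCS"}
gaps = find_scope_gaps(constraints)
```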

[0032] The platform integrates the synonymous term ambiguities, parameter referential ambiguities, and query scope ambiguities, categorizes them according to ambiguity type, and generates clarification information containing a preliminary query intent summary and a list of ambiguities to be confirmed. The preliminary query intent summary refers to the platform's initial understanding of the natural language query command, describing the user's query intent as currently understood by the platform in concise text. The list of ambiguities to be confirmed refers to all ambiguities that require user confirmation or correction, categorized by ambiguity type. Each ambiguity includes a description of the ambiguity type, the current platform judgment result, and candidate options available to the user.
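One possible shape for the clarification information is sketched below. The class and field names are illustrative assumptions; only the content (intent summary, ambiguity type, current judgment, candidate options, grouping by type) comes from the description.

```python
from dataclasses import dataclass, field

AMBIGUITY_TYPES = ("synonym", "parameter_reference", "query_scope")

@dataclass
class Ambiguity:
    kind: str              # one of AMBIGUITY_TYPES
    description: str       # description of the ambiguity
    current_judgment: str  # what the platform currently assumes
    candidates: list       # options offered to the user

@dataclass
class ClarificationInfo:
    intent_summary: str
    ambiguities: dict = field(default_factory=dict)  # kind -> list of Ambiguity

def build_clarification(intent_summary, ambiguities):
    """Group the detected ambiguities by type under the intent summary."""
    grouped = {kind: [] for kind in AMBIGUITY_TYPES}
    for item in ambiguities:
        grouped.setdefault(item.kind, []).append(item)
    return ClarificationInfo(intent_summary, grouped)

info = build_clarification(
    "Show the temperature of recent batches",
    [Ambiguity("synonym", "'temperature' matches several terms",
               "reaction temperature",
               ["reaction temperature", "discharge temperature"]),
     Ambiguity("query_scope", "no time period given", "last 24 hours",
               ["last 24 hours", "last 7 days"])],
)
```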

[0033] The clarification information is pushed to the user's interactive interface, and the platform waits for the user's confirmation and correction. The interactive interface displays the preliminary query intent summary and the list of ambiguities to be confirmed in a visual manner. The user can confirm or modify the preliminary query intent summary and select or correct each ambiguous item in the list of ambiguities to be confirmed one by one. After receiving the user's confirmation and correction, the platform proceeds to the next step of the processing flow.

[0034] The beneficial effects are as follows: By using a terminology entity resolution mechanism based on an industrial domain terminology ontology, the query entities, query dimensions, and query constraints in users' natural language query commands are subjected to semantic similarity analysis, dimension attribution determination, and association path matching, respectively. This systematically identifies ambiguities in synonymous terms, vague parameter referencing, and unclear query scope, and generates clarification information containing a preliminary query intent summary and a list of ambiguities to be confirmed for user confirmation and correction. This effectively solves the problem of large intent recognition deviations in the semantic parsing process of users' colloquial and ambiguous natural language query commands in industrial scenarios. Through a multi-round clarification interaction mechanism, semantic ambiguities are eliminated before query execution, significantly improving the intent recognition accuracy of industrial domain terminology and the accuracy of query statement generation.

[0035] A3: Based on the user's confirmation and correction feedback on the clarification information, generate a precise query intent, and based on the three-dimensional semantic association index graph, convert the precise query intent into an executable cross-database joint query statement. In this embodiment of the invention, this step includes: receiving the user's confirmation and correction operations for each ambiguity point in the clarification information, and integrating the disambiguated query entity, query dimension, and query constraints to generate the precise query intent; performing association path retrieval for the precise query intent in the three-dimensional semantic association index graph to determine the target data sources, target association fields, and target association path involved in the precise query intent; decomposing the precise query intent into query subtasks for each target data source based on the target association path; converting the query subtasks into corresponding query statements based on the query syntax characteristics of each target data source; and integrating the query statements of the target data sources into an executable cross-database joint query statement based on the association rules of the target association path.

[0036] The system receives the user's confirmation and correction operations for each ambiguity point in the clarification information. The confirmation operation refers to the user selecting and confirming each ambiguous item in the list of ambiguities to be confirmed, choosing the correct term, the correct dimension attribution, and the correct query scope from the candidate options provided by the platform. The correction operation refers to the user modifying the query intent initially understood by the platform, supplementing or correcting the specific content of the query entity, query dimension, and query constraints. The system then integrates the user-confirmed query entity, query dimension, and query constraints. The integration refers to organizing the disambiguated query entity, query dimension, and query constraints according to the logical structure of the query intent to form a semantically clear and unambiguous precise query intent. The precise query intent specifies the data object, analysis dimension, and query scope conditions that the user wishes to query.

[0037] Association path retrieval is performed for the precise query intent in the three-dimensional semantic association index graph. The association path retrieval refers to matching the query entity in the precise query intent with each association dimension in the three-dimensional semantic association index graph, filtering out all association paths relevant to the precise query intent, and determining the target data sources, target association fields, and target association path involved in the precise query intent. The target data source is the data storage system that the precise query intent needs to access, including one or more of the DCS system, MES system, and LIMS system. The target association field is the specific data field that the precise query intent needs to query. The target association path is the association path connecting the target data sources, obtained from the three-dimensional semantic association index graph.

[0038] Based on the target association path, the precise query intent is decomposed into query subtasks for each target data source. The decomposition refers to dividing the query task in the precise query intent into independent query subtasks for each target data source according to the division of responsibilities of each data source in the target association path. For example, when the precise query intent involves real-time process parameter data in the DCS system and quality inspection data in the LIMS system, the precise query intent is decomposed into DCS query subtasks and LIMS query subtasks. The DCS query subtask refers to the query task for real-time process parameter data in the DCS system, and the LIMS query subtask refers to the query task for quality inspection data in the LIMS system.
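The decomposition step can be sketched as grouping the requested fields by the data source that owns them. The `field_sources` mapping stands in for the information carried by the target association path, and the intent layout and field names are hypothetical.

```python
def decompose(intent, field_sources):
    """Split a precise query intent into one subtask per target data source."""
    subtasks = {}
    for name in intent["fields"]:
        source = field_sources[name]
        task = subtasks.setdefault(
            source,
            {"source": source, "fields": [], "constraints": intent["constraints"]},
        )
        task["fields"].append(name)
    return subtasks

# Hypothetical intent touching DCS process data and LIMS quality data.
intent = {"fields": ["reaction_temperature", "product_purity"],
          "constraints": {"batch_id": "B001"}}
field_sources = {"reaction_temperature": "DCS", "product_purity": "LIMS"}
subtasks = decompose(intent, field_sources)
```

Each subtask keeps the shared constraints, so that the later association by production batch identifier remains possible.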

[0039] According to the query syntax characteristics of each target data source, the query subtasks are converted into corresponding query statements. The query syntax characteristics refer to the syntax rules of the database query language used by each target data source. For example, the time-series database in the DCS system uses time-series query syntax, the relational database in the MES system uses structured query language syntax, and the relational database in the LIMS system also uses structured query language syntax. The conversion refers to converting the DCS query subtask into a DCS query statement according to the time-series query syntax, converting the MES query subtask into an MES query statement according to the structured query language syntax, and converting the LIMS query subtask into a LIMS query statement according to the structured query language syntax.

[0040] Based on the association rules of the target association path, the query statements of each target data source are integrated into an executable cross-database joint query statement. The association rules refer to the data association conditions between the data sources recorded in the target association path, including association conditions based on production batch identifiers, association conditions based on time alignment, and association conditions based on process links. The integration refers to connecting the DCS query statement, the MES query statement, and the LIMS query statement through association conditions according to the association rules to form a complete executable cross-database joint query statement. The executable cross-database joint query statement can access multiple target data sources simultaneously and return the associated query results in one execution.
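The integration by association conditions can be illustrated with a toy example. Here three in-memory SQLite tables stand in for the DCS, MES, and LIMS data sources, and the only association condition used is equality of the production batch identifier; the table names, columns, and values are assumptions, and a real DCS time-series store would need its own query syntax and time alignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dcs (batch_id TEXT, reaction_temperature REAL);
CREATE TABLE mes (batch_id TEXT, operating_condition TEXT);
CREATE TABLE lims (batch_id TEXT, product_purity REAL);
INSERT INTO dcs VALUES ('B001', 85.2), ('B002', 87.9);
INSERT INTO mes VALUES ('B001', 'normal'), ('B002', 'normal');
INSERT INTO lims VALUES ('B001', 99.1), ('B002', 98.4);
""")

def build_joint_query(sub_queries, columns, key="batch_id"):
    """Join per-source query statements on the shared association key."""
    names = list(sub_queries)
    sql = f"SELECT {', '.join(columns)} FROM ({sub_queries[names[0]]}) AS {names[0]}"
    for name in names[1:]:
        sql += f" JOIN ({sub_queries[name]}) AS {name} USING ({key})"
    return sql + f" ORDER BY {key}"

sub_queries = {
    "dcs": "SELECT batch_id, reaction_temperature FROM dcs",
    "mes": "SELECT batch_id, operating_condition FROM mes",
    "lims": "SELECT batch_id, product_purity FROM lims",
}
columns = ["batch_id", "reaction_temperature", "operating_condition", "product_purity"]
rows = conn.execute(build_joint_query(sub_queries, columns)).fetchall()
```

One execution returns the associated records of all three sources for each batch.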

[0041] The beneficial effects are as follows: By receiving the user's confirmation and correction operations for each ambiguity point in the clarification information, the disambiguated query entity, query dimension, and query constraints are integrated into a precise query intent. Association path retrieval is then performed based on the three-dimensional semantic association index graph, and the precise query intent is decomposed into query subtasks for each target data source and converted into corresponding query statements. Finally, based on the association rules, these are integrated into an executable cross-database joint query statement. This realizes accurate conversion from the user's natural language query commands to cross-database joint query statements, solving the problems of low query statement generation accuracy and the low efficiency of manually integrating results after querying multiple data sources separately in multi-source heterogeneous data scenarios. A single query command can complete the association retrieval of multiple data sources, significantly improving the query statement generation efficiency and cross-database query execution efficiency of the industrial intelligent query platform.

[0042] A4: Perform data retrieval and confidence assessment on the executable cross-database joint query statement, and push the search results and confidence assessment results to the user for verification. In this embodiment of the invention, this step includes: performing data retrieval on the data sources involved in the precise query intent based on the cross-database joint query statement to obtain the search results; evaluating the confidence level of the search results based on the data integrity dimension, the semantic relevance dimension, and the timeliness dimension; and pushing the search results and the confidence assessment results to the interactive interface for user verification.

[0043] The formula for calculating the confidence assessment result is as follows:

$$C = \left(\omega_1 \cdot \frac{N_{valid}}{N_{total}}\right)^{\alpha} \cdot \left(\omega_2 \cdot S\right)^{\beta} \cdot \left(\omega_3 \cdot \frac{1}{\Delta t}\right)^{\gamma}$$

where $C$ is the confidence assessment result of the search results; $\omega_1$ is the weighting coefficient of the data integrity dimension; $N_{valid}$ is the number of fields that actually returned valid data in the data integrity dimension; $N_{total}$ is the total number of fields involved in the precise query intent; $\alpha$ is the geometric weighting exponent of the data integrity dimension; $\omega_2$ is the weighting coefficient of the semantic relevance dimension; $S$ is the semantic matching ratio of the semantic relevance dimension; $\beta$ is the geometric weighting exponent of the semantic relevance dimension; $\omega_3$ is the weighting coefficient of the timeliness dimension; $\Delta t$ is the time interval between the latest record time of the data in the search results and the current query time; and $\gamma$ is the geometric weighting exponent of the timeliness dimension.

[0044] The step of pushing the search results and the confidence assessment results to the interactive interface for user verification includes: displaying the search results, the confidence assessment results, and the visualization information of the association path in the interactive interface; and pushing the search results to the interactive interface, where users manually verify, correct, and anomaly-mark the search results, generating correction feedback.

[0045] Based on the cross-database joint query statement, data retrieval is performed on the data sources involved in the precise query intent. The data retrieval refers to sending data access requests to the DCS system, MES system, and LIMS system respectively according to the query conditions and association rules defined in the cross-database joint query statement, and extracting the data records that meet the query conditions from each data source: real-time process parameter data records returned by the DCS system, production batch operating condition data records returned by the MES system, and quality inspection data records returned by the LIMS system. The data records returned by each data source are then associated and concatenated according to the association conditions in the cross-database joint query statement to generate the retrieval results. The retrieval results are a data set integrated into a unified format after cross-database association.

[0046] The search results are subjected to a data integrity check. This check involves examining each field in the search results to see if its value is null and if it conforms to the field format specifications of the corresponding data source. The number of non-null fields that conform to the data format specifications in the search results is counted, and the ratio of this number to the total number of fields involved in the precise query intent is calculated to obtain a data integrity ratio. The data integrity ratio reflects the coverage of valid data in the search results. The closer the data integrity ratio is to one, the fewer missing or abnormal fields there are in the search results, and the higher the degree of data integrity.
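A minimal sketch of the data integrity ratio for a single returned record. The `validators` mapping is a hypothetical stand-in for the field format specifications of each data source, and the field names are illustrative.

```python
def data_integrity_ratio(record, expected_fields, validators):
    """Share of expected fields that are non-null and pass their format check."""
    valid = 0
    for name in expected_fields:
        value = record.get(name)
        check = validators.get(name, lambda v: True)  # no spec: accept any value
        if value is not None and check(value):
            valid += 1
    return valid / len(expected_fields)

expected_fields = ["batch_id", "reaction_temperature",
                   "operating_condition", "product_purity"]
validators = {"reaction_temperature": lambda v: isinstance(v, float),
              "product_purity": lambda v: isinstance(v, float) and 0 <= v <= 100}
# operating_condition is missing and product_purity is null.
record = {"batch_id": "B001", "reaction_temperature": 85.2, "product_purity": None}
ratio = data_integrity_ratio(record, expected_fields, validators)
```

Two of the four expected fields hold valid data here, so the ratio is 0.5.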

[0047] The semantic relevance of the search results is calculated. This calculation involves semantically matching each query term in the precise query intent with the names of the data fields returned in the search results, and then calculating the ratio of the number of successfully matched query terms to the total number of query terms to obtain the semantic matching ratio. The semantic matching ratio reflects the degree of semantic fit between the data fields returned in the search results and the user's query intent. A higher semantic matching ratio indicates that the data fields returned in the search results better match the user's actual query needs.
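The ratio itself is simple to compute once a matching rule is fixed. The sketch below swaps in a plain alias lookup for the semantic matching, which is an assumption made only to keep the example self-contained; the alias table and field names are hypothetical.

```python
def semantic_matching_ratio(query_terms, returned_fields, aliases):
    """Matched query terms divided by the total number of query terms."""
    fields = {name.lower() for name in returned_fields}
    matched = 0
    for term in query_terms:
        key = term.lower()
        # Map the query term to a field name; fall back to the term itself.
        if aliases.get(key, key) in fields:
            matched += 1
    return matched / len(query_terms)

aliases = {"reaction temperature": "reaction_temperature",
           "purity": "product_purity"}
match_ratio = semantic_matching_ratio(
    ["reaction temperature", "purity", "viscosity"],
    ["reaction_temperature", "product_purity"],
    aliases,
)
```

Two of the three query terms find a returned field, giving a ratio of two thirds.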

[0048] The timeliness analysis of the search results refers to obtaining the latest record timestamp of each data record in the search results, calculating the time difference between it and the current query time, and obtaining the timeliness time interval. The timeliness time interval reflects the freshness of the data in the search results. The smaller the timeliness time interval, the closer the data in the search results is to the current time, and the better the timeliness of the data.
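A short sketch of the timeliness time interval. The text does not state a unit, so hours are assumed here, and the timestamps are illustrative.

```python
from datetime import datetime

def timeliness_interval_hours(record_times, query_time):
    """Hours between the latest record timestamp and the query time."""
    latest = max(record_times)
    return (query_time - latest).total_seconds() / 3600

query_time = datetime(2026, 1, 1, 12, 0)
record_times = [datetime(2026, 1, 1, 9, 0), datetime(2026, 1, 1, 10, 0)]
interval = timeliness_interval_hours(record_times, query_time)
```

The latest record is from 10:00 and the query is issued at 12:00, so the interval is 2.0 hours.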

[0049] Based on the data integrity ratio, the semantic matching ratio, and the timeliness time interval, the confidence assessment result of the search results is calculated. The confidence assessment result refers to the quantitative score of the overall credibility of the search results, with a value range of zero to one. The higher the confidence assessment result, the better the data integrity of the search results, the higher the semantic fit with the user's query intent, and the stronger the timeliness of the data.

[0050] The interactive interface displays the search results, the confidence assessment results, and the visualization information of the association paths. The visualization information of the association paths refers to the graphical display of the association paths from the query starting point to each data source in the three-dimensional semantic association index graph, including the connection relationships between the data sources, the correspondence of the association fields, and the magnitude of the association weights, so that users can intuitively understand the source of each piece of data in the search results and the association logic between the data sources.

[0051] The search results are pushed to the interactive interface, where users can manually verify, correct, and anomaly mark the results. Manual verification refers to the user reviewing each data record in the search results to determine if the data content is correct and meets expectations. Data correction refers to the user manually modifying erroneous data found during the verification process and replacing the original data with corrected data. Anomaly marking refers to the user marking abnormal data found during the verification process, indicating the anomaly type and reason. Correction feedback refers to all operation records generated by the user during the manual verification, data correction, and anomaly marking process, including verified data records, corrected data records, and data records marked as abnormal.

[0052] In the formula, $C$ is the confidence assessment result of the search results, with a value ranging from zero to one; it is a quantitative score of the overall credibility of the search results. The closer the value is to one, the better the data integrity of the search results, the higher the semantic fit with the user's query intent, and the stronger the timeliness of the data. $\omega_1$ is the weighting coefficient of the data integrity dimension, used to adjust the importance of the data integrity dimension in the confidence assessment. $N_{valid}$ is the number of fields that actually return valid data in the data integrity dimension, where valid data means fields in the search results that are not empty and conform to the data format specifications. $N_{total}$ is the total number of fields involved in the precise query intent, i.e., the total number of data fields that the user expects to query. $\alpha$ is the geometric weighting exponent of the data integrity dimension, used to adjust the shape of the curve describing the influence of the data integrity dimension on the overall confidence level. $\omega_2$ is the weighting coefficient of the semantic relevance dimension, used to adjust the importance of the semantic relevance dimension in the confidence assessment. $S$ is the semantic matching ratio of the semantic relevance dimension, i.e., the ratio of the number of query terms in the precise query intent that successfully match the names of the returned data fields in the search results to the total number of query terms. $\beta$ is the geometric weighting exponent of the semantic relevance dimension, used to adjust the shape of the curve describing the influence of the semantic relevance dimension on the overall confidence level. $\omega_3$ is the weighting coefficient of the timeliness dimension, used to adjust the importance of the timeliness dimension in the confidence assessment. $\Delta t$ is the time interval between the latest record time in the search results and the current query time; the smaller the value, the closer the data is to the current moment. $\gamma$ is the geometric weighting exponent of the timeliness dimension, used to adjust the shape of the curve describing the influence of the timeliness dimension on the overall confidence level. The three weighting coefficients $\omega_1$, $\omega_2$, and $\omega_3$ sum to one.

[0053] First, the evaluation value of the data integrity dimension is calculated: the number of fields that actually returned valid data, $N_{valid}$, is divided by the total number of fields involved in the precise query intent, $N_{total}$, to obtain the data integrity ratio; the data integrity ratio is multiplied by the weighting coefficient of the data integrity dimension, $\omega_1$, to obtain the weighted value of the data integrity dimension; and this weighted value is raised to the power $\alpha$ to obtain the evaluation value of the data integrity dimension. Next, the evaluation value of the semantic relevance dimension is calculated: the semantic matching ratio $S$ is multiplied by the weighting coefficient of the semantic relevance dimension, $\omega_2$, to obtain the weighted value of the semantic relevance dimension, and this weighted value is raised to the power $\beta$ to obtain the evaluation value of the semantic relevance dimension. Then, the evaluation value of the timeliness dimension is calculated: the reciprocal of the timeliness time interval $\Delta t$ is taken as the timeliness factor; the timeliness factor is multiplied by the weighting coefficient of the timeliness dimension, $\omega_3$, to obtain the weighted value of the timeliness dimension; and this weighted value is raised to the power $\gamma$ to obtain the evaluation value of the timeliness dimension. Finally, the evaluation values of the data integrity dimension, the semantic relevance dimension, and the timeliness dimension are multiplied together to obtain the confidence assessment result $C$ of the search results.
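The calculation steps above can be written out as a short sketch. The parameter names, the default weights, and the exponent values are assumptions chosen for illustration; the text fixes only the structure (weighted ratio raised to an exponent per dimension, then a product) and the requirement that the three weights sum to one. The unit of the time interval is not specified, so the interval is assumed to be a positive number in a unit for which its reciprocal does not exceed one.

```python
def confidence(n_valid, n_total, match_ratio, interval,
               weights=(0.4, 0.3, 0.3), exponents=(1.0, 1.0, 1.0)):
    """Product of the three exponent-weighted dimension evaluation values."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("dimension weights must sum to one")
    if n_total <= 0 or interval <= 0:
        raise ValueError("n_total and interval must be positive")
    w1, w2, w3 = weights
    a, b, g = exponents
    integrity = (w1 * n_valid / n_total) ** a    # data integrity dimension
    relevance = (w2 * match_ratio) ** b          # semantic relevance dimension
    timeliness = (w3 * (1.0 / interval)) ** g    # timeliness dimension
    return integrity * relevance * timeliness

# All four fields valid, all terms matched, latest record 2 time units old.
score = confidence(4, 4, 1.0, 2.0, exponents=(0.5, 0.5, 0.5))
```

Because the result is a product, a zero in any one dimension (for example, no valid fields) drives the whole confidence value to zero.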

[0054] The beneficial effects are as follows: By evaluating confidence based on three dimensions—data integrity, semantic relevance, and timeliness—the search results are quantitatively scored, providing users with a quantifiable basis for judging the reliability of the results. This solves the problem that existing intelligent query methods lack quantifiable support for returning reliable query results. By displaying search results, confidence evaluation results, and visualization of related paths in the interactive interface, users are allowed to manually verify, correct, and anomaly label the search results. This achieves a closed-loop connection between automatic querying and manual verification, enhancing the interpretability and user trust of the search results. At the same time, collecting user feedback provides data support for the continuous optimization of subsequent term matching weights.

[0055] A5: Based on the verification results and user feedback, update the terminology matching weights of the industrial production terminology ontology.

[0056] In this embodiment of the invention, updating the terminology matching weights of the industrial production terminology ontology based on verification results and user feedback includes: constructing a clarification feedback vector from the user's confirmation and correction operations for each semantic ambiguity point in the clarification information; constructing a verification feedback vector from the user's manual verification, data correction, and anomaly marking of the search results; calculating the adjustment amount of the matching weight of each term in the terminology ontology based on the clarification feedback vector and the verification feedback vector; and optimizing and updating the matching weight of each term in the terminology ontology based on the adjustment amount. In the calculation of the term matching weight, the adjustment amount of the $i$-th term is the weighted sum $\Delta W_i = \eta_1 r_i + \eta_2 v_i$, and the updated weight $W_i'$ is obtained from $W_i$ and $\Delta W_i$ according to the preset update rule, where $i$ is the index number assigned to each term in the terminology ontology; $T_i$ is the term with index number $i$; $W_i'$ is the updated term matching weight of the $i$-th term; $W_i$ is the term matching weight of the $i$-th term before the update; $r_i$ is the correction score of the term in the clarification feedback vector; $v_i$ is the verification feedback score of the term in the verification feedback vector; $\eta_1$ is the influence coefficient of the clarification feedback vector on the term matching weight update; and $\eta_2$ is the influence coefficient of the verification feedback vector on the term matching weight update.

[0057] The step of calculating the adjustment amount of each term matching weight in the terminology ontology based on the clarification feedback vector and the verification feedback vector includes: performing a correction consistency analysis on the user-corrected terms in the clarification feedback vector to obtain the correction score of each term; performing a data correction magnitude analysis on the user-verified search results in the verification feedback vector to obtain the verification feedback score of each term; and weighting and summing the correction score and the verification feedback score to obtain the adjustment amount.

[0058] The system collects user confirmation and correction operations for each semantic ambiguity point in the clarification information. The confirmation operation refers to the user selecting the correct term from the candidate term options provided by the platform, and the correction operation refers to the user inputting custom correction content for the ambiguity points determined by the platform. The user's confirmation and correction operations for each semantic ambiguity point are classified and summarized according to the term dimension. The frequency of each term being corrected by the user in the clarification information and the consistency of different correction directions for the term are statistically analyzed. A clarification feedback vector is constructed. The clarification feedback vector is a vector data structure that uses each term in the professional terminology ontology as an index and the statistical results of the user's correction operations for each term as values. It records the user's correction preferences for each term in multiple rounds of clarification interaction.

[0059] The system collects the user's feedback operations on the search results, including manual verification, data correction, and anomaly marking. Manual verification involves users reviewing each data record in the search results and determining its accuracy. Data correction involves users manually modifying erroneous data and replacing the original data. Anomaly marking involves users marking abnormal data and recording the anomaly type and cause. The user feedback operations are categorized and summarized by term. The system calculates the verification pass rate for each term and the extent of user corrections to the corresponding data, and constructs a verification feedback vector. The verification feedback vector is a vector data structure that uses each term in the terminology ontology as an index and the statistical results of the user's verification operations for each term as values. It records the user's level of acceptance of, and correction tendency towards, the query results for each term during the verification process.

[0060] A correction consistency analysis is performed on the user-corrected terms in the clarification feedback vector. This analysis involves statistically analyzing whether the user's correction direction for each term is consistent across multiple clarification interactions. A higher correction consistency value indicates that multiple corrections to the same term all point to the same target term, while a lower value indicates that the user's correction direction for the same term differs. The correction frequency and the correction consistency value are weighted and summed to obtain the clarification feedback correction score for each term. This score reflects the intensity and certainty of the user's correction of the term during the clarification process. A higher score indicates that the term requires more adjustment of its matching weight.
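One way to turn the correction frequency and correction consistency into a score is sketched below. The definitions used here are assumptions: frequency is taken as corrections per appearance of the term in clarification information, consistency as the share of corrections that point to the most common target term, and the two are combined with equal weights.

```python
from collections import Counter

def correction_score(appearances, corrections, w_freq=0.5, w_cons=0.5):
    """Weighted sum of correction frequency and correction consistency."""
    if appearances == 0 or not corrections:
        return 0.0
    frequency = len(corrections) / appearances
    # Share of corrections that agree on the most common target term.
    top_count = Counter(corrections).most_common(1)[0][1]
    consistency = top_count / len(corrections)
    return w_freq * frequency + w_cons * consistency

# "temperature" appeared 10 times and was corrected 4 times,
# 3 of them to the same target term.
corrections = ["reaction temperature"] * 3 + ["discharge temperature"]
score_r = correction_score(10, corrections)
```

With these numbers the frequency is 0.4 and the consistency is 0.75, so the score is 0.575.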

[0061] The data correction magnitude analysis is performed on the user-verified search results in the verification feedback vector. The data correction magnitude analysis refers to, for each term, statistically analyzing the proportion and degree of data record correction made by users in the search results corresponding to that term. When the proportion and magnitude of user correction for the search results corresponding to a certain term are high, it indicates that the matching accuracy of that term is low. The verification pass rate and the data correction magnitude value are weighted and summed to obtain the verification feedback verification score for each term. The verification feedback verification score reflects the user's degree of acceptance and correction tendency for that term during the verification process. The higher the score, the more the matching weight of that term needs to be adjusted.

[0062] The adjustment amount for each term matching weight is obtained by weighting and summing the clarification feedback correction score and the verification feedback verification score. The weighted summation refers to multiplying the clarification feedback correction score by the clarification feedback influence coefficient and the verification feedback verification score by the verification feedback influence coefficient, and then adding the two together to obtain the adjustment amount. The adjustment amount refers to the value that needs to be increased or decreased for the current matching weight of each term in the terminology ontology. A positive adjustment amount indicates that the matching weight of the term needs to be increased, and a negative adjustment amount indicates that the matching weight of the term needs to be decreased.

[0063] Based on the adjustment amount, the matching weight of each term in the terminology ontology is optimized and updated. Here, optimizing and updating means combining each term's pre-update matching weight with its adjustment amount according to a preset update rule to obtain the term's updated matching weight. The updated matching weight is then written into the corresponding term node in the terminology ontology, replacing the original value. In this way, the term matching weights in the terminology ontology are continuously optimized based on the user's historical correction and verification operations, which improves the accuracy of term entity resolution for subsequent query commands.

[0064] The updated term matching weight is calculated as follows:

$$w_i' = \alpha \cdot C_i + \beta \cdot V_i + (1 - \alpha - \beta) \cdot w_i$$

where $i$ is the index number assigned to each term in the terminology ontology database, uniquely identifying each term node in the database; $t_i$ is the term with index number $i$, that is, the corresponding industry-specific technical term in the ontology database; $w_i'$ is the updated term matching weight of the $i$-th term, ranging from zero to one, which reflects the matching priority of the term in term entity resolution after optimization based on user feedback, a higher value indicating that the term is more likely to be selected first in semantic similarity matching; $w_i$ is the term matching weight of the $i$-th term before the update, ranging from zero to one, which reflects the matching priority of the term before the update; $C_i$ is the correction score of term $t_i$ in the clarification feedback vector, ranging from zero to one, which reflects the strength and certainty of the user's correction of the term over multiple rounds of clarification interaction, a higher value indicating that users correct the term more frequently and in a more consistent direction; $V_i$ is the verification feedback score of term $t_i$ in the verification feedback vector, ranging from zero to one, which reflects the user's level of acceptance of, and tendency to correct, the search results corresponding to the term, a higher value indicating a lower verification pass rate and a greater degree of data correction; $\alpha$ is the influence coefficient of the clarification feedback vector on the term matching weight update, used to adjust the contribution ratio of clarification feedback in the update; $\beta$ is the influence coefficient of the verification feedback vector on the term matching weight update, used to adjust the contribution ratio of verification feedback in the update; and the sum of $\alpha$ and $\beta$ is less than or equal to one.

[0065] First, the contribution value of the clarification feedback is calculated: the correction score $C_i$ of term $t_i$ in the clarification feedback vector is multiplied by the clarification feedback influence coefficient $\alpha$ to obtain the contribution of the clarification feedback to the term matching weight update. Next, the contribution value of the verification feedback is calculated: the verification feedback score $V_i$ of term $t_i$ in the verification feedback vector is multiplied by the verification feedback influence coefficient $\beta$ to obtain the contribution of the verification feedback to the update. Then the retention value is calculated: $\alpha$ and $\beta$ are subtracted from one to obtain the retention factor $1 - \alpha - \beta$, and the retention factor is multiplied by the pre-update matching weight $w_i$ of term $t_i$ to obtain the retention value, which reflects the proportion of the pre-update matching weight that is retained in this update. Finally, the clarification feedback contribution, the verification feedback contribution, and the retention value are added together to obtain the updated matching weight $w_i'$ of term $t_i$. The updated matching weight is then written into the corresponding term node in the terminology ontology, replacing the original matching weight value.
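The update steps above can be sketched in a few lines. The function name, the default coefficients (0.2 and 0.3), the example term, and the dictionary standing in for the terminology ontology are assumptions chosen for illustration; only the structure of the calculation (two feedback contributions plus a retention value) follows the description.

```python
def update_matching_weight(w_old, correction_score, verification_score,
                           alpha=0.2, beta=0.3):
    """Return the updated matching weight of one term.

    w_new = alpha * correction_score + beta * verification_score
            + (1 - alpha - beta) * w_old
    """
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    for name, value in (("w_old", w_old),
                        ("correction_score", correction_score),
                        ("verification_score", verification_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")

    clarification_contribution = alpha * correction_score
    verification_contribution = beta * verification_score
    retention_value = (1 - alpha - beta) * w_old  # share of the old weight that is kept
    return clarification_contribution + verification_contribution + retention_value


# Hypothetical ontology fragment: term -> matching weight.
ontology = {"kiln inlet temperature": 0.6}
term = "kiln inlet temperature"
ontology[term] = update_matching_weight(ontology[term],
                                        correction_score=0.575,
                                        verification_score=0.3125)
# 0.2 * 0.575 + 0.3 * 0.3125 + 0.5 * 0.6 = 0.50875
```

Because the three coefficients are non-negative and sum to one, the updated weight is a convex combination of values that each lie between zero and one, so it stays within the same range without any clipping.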

[0066] The beneficial effects are as follows. The user's correction feedback on the clarification information and verification feedback on the search results are collected to construct a clarification feedback vector and a verification feedback vector. Based on the correction consistency analysis and the data correction magnitude analysis, the adjustment amount of each term matching weight is calculated, and the term matching weights in the professional terminology ontology are continuously optimized and updated. This achieves adaptive, iterative optimization of the terminology resolution model driven by user feedback, solves the problem that existing intelligent data querying methods cannot continuously optimize the model based on user feedback, and forms a complete human-machine collaborative closed loop of data querying, clarification, query, verification, and feedback optimization. The platform can therefore keep adapting to the evolving professional terminology system and query needs of industrial scenarios, and continuously improve its semantic understanding accuracy and query accuracy in complex industrial production environments.

[0067] Figure 2 is a functional block diagram of a human-machine collaborative analysis platform for intelligent data querying in an industrial scenario, provided by an embodiment of the present invention.

[0068] The intelligent data querying and human-machine collaborative analysis platform 100 for industrial scenarios described in this invention can be installed in electronic devices. Depending on the functions implemented, the intelligent data querying and human-machine collaborative analysis platform 100 for industrial scenarios may include an index graph construction module 101, a terminology resolution and clarification module 102, a query statement conversion module 103, a retrieval verification and push module 104, and a weight feedback and update module 105. The modules described in this invention can also be referred to as units, which are a series of computer program segments that can be executed by the processor of an electronic device and can perform a fixed function, stored in the memory of the electronic device.

[0069] In this embodiment, the functions of each module / unit are as follows. The index graph construction module 101 is used to establish a cross-database association mapping and generate a three-dimensional semantic association index graph, using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors. The terminology resolution and clarification module 102 is used to perform term entity resolution on natural language query commands from users in industrial production and to generate clarification information for those commands. The query statement conversion module 103 is used to generate the precise query intent of the clarification information based on the user's confirmation and correction feedback on the clarification information, and to convert the precise query intent into an executable cross-database joint query statement based on the three-dimensional semantic association index graph. The retrieval verification and push module 104 is used to perform data retrieval and confidence assessment on the executable cross-database joint query statement, and to push the retrieval results and confidence assessment results to the user for verification. The weight feedback and update module 105 is used to update the term matching weights of the industrial production professional terminology ontology based on the verification results and user correction feedback.

[0070] In the several embodiments provided by this invention, it should be understood that the disclosed methods and platforms can be implemented in other ways. For example, the platform embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and other division methods may be used in actual implementation.

[0071] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.

[0072] Furthermore, the functional modules in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or in the form of hardware plus software functional modules.

[0073] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the present invention can be implemented in other specific forms without departing from the spirit or essential characteristics of the present invention.

[0074] The embodiments of this application can acquire and process relevant data based on an artificial intelligence technology. Artificial intelligence is the theory, method, technology, and application platform that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.

[0075] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A human-machine collaborative analysis method for intelligent data querying in an industrial scenario, characterized in that the method includes: A1: using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors, establishing a cross-database association mapping and generating a three-dimensional semantic association index graph; A2: performing term entity resolution on natural language query commands of users in industrial production to generate clarification information for the natural language query commands; A3: based on the user's confirmation and correction feedback on the clarification information, generating the precise query intent of the clarification information, and, based on the three-dimensional semantic association index graph, converting the precise query intent into an executable cross-database joint query statement; A4: performing data retrieval and confidence assessment on the executable cross-database joint query statement, and pushing the retrieval results and confidence assessment results to the user for verification; A5: based on the verification results and user correction feedback, updating the term matching weights of the industrial production professional terminology ontology.

2. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 1, characterized in that establishing the cross-database association mapping and generating the three-dimensional semantic association index graph, using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors, includes: annotating with semantic tags the real-time process parameter data, production batch operating condition data, and quality inspection data of the multi-source heterogeneous data in industrial production; using the production batch identifier commonly contained in the semantic tags as the association anchor, establishing a cross-database association mapping between each parameter field in the real-time process parameter data and each detection index field in the quality inspection data; and generating the three-dimensional semantic association index graph using the production batch identifier, the parameter fields, and the detection index fields as association dimensions.

3. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 1, characterized in that performing term entity resolution on the natural language query commands of users in industrial production to generate clarification information for the natural language query commands includes: performing semantic similarity matching between the query entities, query dimensions, and query constraints in the natural language query commands and the professional terminology ontology in industrial production, to identify synonymous semantic ambiguities in the natural language query commands; determining the dimensional attribution of each query entity with respect to the association dimensions in the three-dimensional semantic association index graph, to identify parameter reference ambiguities of the query entity; matching the query constraints against the three-dimensional semantic association index graph, to identify points where the query range of the query constraints is unclear; integrating the synonymous semantic ambiguities, the parameter reference ambiguities, and the query range ambiguities into clarification information that includes a preliminary query intent summary and a list of ambiguities to be confirmed; and pushing the clarification information to the user's interactive interface and waiting for the user to confirm and correct it.

4. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 1, characterized in that generating the precise query intent based on the user's confirmation and correction feedback on the clarification information, and converting the precise query intent into an executable cross-database joint query statement based on the three-dimensional semantic association index graph, includes: receiving the user's confirmation and correction operations for each ambiguous point in the clarification information, and integrating the disambiguated query entities, query dimensions, and query constraints to generate the precise query intent; retrieving association paths for the precise query intent in the three-dimensional semantic association index graph to determine the target data sources, target association fields, and target association path involved in the precise query intent; based on the target association path, decomposing the precise query intent into query subtasks for each target data source; based on the query syntax characteristics of each target data source, converting the query subtasks into corresponding query statements; and, based on the association rules of the target association path, integrating the query statements of the target data sources into an executable cross-database joint query statement.

5. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 1, characterized in that performing data retrieval and confidence assessment on the executable cross-database joint query statement, and pushing the retrieval results and confidence assessment results to the user for verification, includes: based on the cross-database joint query statement, performing data retrieval on the data sources involved in the precise query intent to obtain the retrieval results; evaluating the confidence level of the retrieval results based on a data integrity dimension, a semantic relevance dimension, and a timeliness dimension to obtain the confidence assessment results; and pushing the retrieval results and the confidence assessment results to the interactive interface for user verification.

6. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 5, characterized in that the confidence assessment result is calculated by a formula over the following quantities: the confidence assessment result of the retrieval results; the weighting coefficient of the data integrity dimension; the number of fields that actually returned valid data in the data integrity dimension; the total number of fields involved in the precise query intent; the geometric weighting index of the data integrity dimension; the weighting coefficient of the semantic relevance dimension; the semantic matching ratio of the semantic relevance dimension; the geometric weighting index of the semantic relevance dimension; the weighting coefficient of the timeliness dimension; the time interval between the latest record time of the data in the retrieval results and the current query time; and the geometric weighting index of the timeliness dimension.

7. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 5, characterized in that pushing the retrieval results and the confidence assessment results to the interactive interface for user verification includes: displaying, on the interactive interface, the retrieval results, the confidence assessment results, and visualization information of the association path; and pushing the retrieval results to the interactive interface, where the user manually verifies, corrects, and annotates the retrieval results, generating correction feedback.

8. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 7, characterized in that updating the term matching weights of the industrial production professional terminology ontology based on the verification results and user correction feedback includes: constructing a clarification feedback vector from the user's confirmation and correction operations for each semantic ambiguity point in the clarification information; constructing a verification feedback vector from the user's manual verification, correction, and anomaly annotation of the retrieval results; based on the clarification feedback vector and the verification feedback vector, calculating the adjustment amount of the matching weight of each term in the terminology ontology; and, based on the adjustment amount, optimizing and updating the matching weight of each term in the terminology ontology. The term matching weight is calculated as follows: $$w_i' = \alpha \cdot C_i + \beta \cdot V_i + (1 - \alpha - \beta) \cdot w_i$$ where $i$ is the index number assigned to each term in the terminology ontology; $t_i$ is the term with index number $i$; $w_i'$ is the updated term matching weight of the $i$-th term; $w_i$ is the term matching weight of the $i$-th term before the update; $C_i$ is the correction score of term $t_i$ in the clarification feedback vector; $V_i$ is the verification feedback score of term $t_i$ in the verification feedback vector; $\alpha$ is the influence coefficient of the clarification feedback vector on the term matching weight update; and $\beta$ is the influence coefficient of the verification feedback vector on the term matching weight update.

9. The human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 8, characterized in that calculating the adjustment amount of each term matching weight in the terminology ontology based on the clarification feedback vector and the verification feedback vector includes: performing a correction consistency analysis on the user-corrected terms in the clarification feedback vector to obtain the correction score of each term in the clarification feedback vector; performing a data correction magnitude analysis on the user-verified retrieval results in the verification feedback vector to obtain the verification feedback score of each term; and weighting and summing the correction score and the verification feedback score to obtain the adjustment amount.

10. A human-machine collaborative analysis platform for intelligent data querying in an industrial scenario, characterized in that the platform is used to implement the human-machine collaborative analysis method for intelligent data querying in an industrial scenario as described in claim 1, and includes: an index graph construction module, used to establish a cross-database association mapping and generate a three-dimensional semantic association index graph, using multi-source heterogeneous data labeled with semantic tags in industrial production as association anchors; a terminology resolution and clarification module, used to perform term entity resolution on natural language query commands from users in industrial production and to generate clarification information for the natural language query commands; a query statement conversion module, used to generate the precise query intent of the clarification information based on the user's confirmation and correction feedback, and to convert the precise query intent into an executable cross-database joint query statement based on the three-dimensional semantic association index graph; a retrieval verification and push module, used to perform data retrieval and confidence assessment on the executable cross-database joint query statement, and to push the retrieval results and confidence assessment results to the user for verification; and a weight feedback and update module, used to update the term matching weights of the industrial production professional terminology ontology based on the verification results and user correction feedback.