A method and system for selecting and integrating data within multiple datasets

By filtering and merging the connection identifiers and integration attributes of multiple datasets, the problem of filtering and merging multiple datasets that cannot be handled in existing technologies is solved, and efficient visualization of monitoring data for multiple processes is achieved.

CN116467355B · Active · Publication Date: 2026-06-30 · MINGDU ZHIYUN (ZHEJIANG) TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
MINGDU ZHIYUN (ZHEJIANG) TECH CO LTD
Filing Date
2023-02-13
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing dataset processing methods can only filter and merge two datasets at a time and cannot effectively handle more than two related datasets, so they fail to meet the need for visual presentation of changes in monitoring data across multiple processes.

Method used

By obtaining the connection identifiers and integration attributes of multiple datasets, the dataset queue is split into multiple dataset groups arranged in sequence using a depth-first search algorithm. The field parameter values are then filtered and merged based on the integration attributes of the connection identifiers to finally generate an integrated dataset.

Benefits of technology

It enables efficient filtering and merging of multiple datasets, meets the need for visual presentation of changes in monitoring data across multiple processes, and improves the efficiency of data analysis.



Abstract

This invention discloses a method and system for selecting and integrating data within multiple datasets. It obtains the field attributes and field parameter values of multiple datasets connected by connection identifiers, reads the integration attributes of each connection identifier, and splits the connection queue formed by the datasets into multiple dataset groups arranged sequentially. According to a preset merging order, adjacent datasets are filtered on their field parameter values based on the integration attributes of their connection identifiers and then merged into a branch dataset. This branch dataset is then filtered and merged with the next dataset until the entire dataset group has been merged, and the results are combined to form an integrated dataset. This method achieves the filtering and merging of fields within the connection queues of multiple datasets with branches.

Description

Technical Field

[0001] This invention relates to the field of information technology, and in particular to a method and system for selecting and integrating data within multiple datasets.

Background Technology

[0002] When conducting data analysis, multiple datasets are often filtered and merged to output a new dataset that meets the requirements for subsequent data analysis and presentation. For example, in current production process data analysis, to better present the comparative changes in monitoring data across multiple processes, it is often necessary to filter and merge multiple datasets into a single target dataset. Existing dataset processing methods involve setting content filtering rules for two datasets, then filtering and merging them to create a target dataset that meets the requirements. However, this filtering and merging method typically works only between two datasets and cannot filter and merge more than two related datasets.

Summary of the Invention

[0003] This invention addresses the shortcomings of existing technologies by providing a method for selecting and integrating data from multiple datasets, comprising the following steps:

[0004] S1: Obtain the field attributes and field parameter values of multiple datasets that are interconnected by connection markers, and read the integration attributes of each connection marker.

[0005] S2: Split the connection queue formed by the connection of each dataset into multiple dataset groups arranged in sequence. According to the preset merging order, adjacent datasets are filtered on their field parameter values according to the integration attributes of their connection identifier and merged into a branch dataset. Then, the branch dataset is filtered and merged with the next dataset until the entire dataset group is merged.

[0006] S3: Obtain the integration attribute and connection field of the connection identifiers connecting each dataset group. The branch datasets formed by integrating the dataset groups are merged into an integrated dataset after their field parameter values are filtered according to the integration attribute and connection field of the corresponding connection identifiers.

[0007] Preferably, step S2 further includes:

[0008] Obtain a starting data table node of the dataset queue according to the set direction. From the starting data table node, traverse each data table node to the end data table node in the dataset queue using a depth-first algorithm based on the connection relationship of each data table, thus forming a queue connection path.

[0009] Based on the queue connection path, retrieve and record the association relationship between the last data table node and each data table in reverse order, forming at least one dataset group arranged in a sequential sequence.

[0010] Preferably, step S2 further includes: at least one dataset in the dataset queue is connected to multiple different datasets on one side through different connection identifiers, wherein multiple dataset groups are formed by working backward from the end data table node along the queue connection path and recording its association with each data table.

[0011] Preferably, step S3 further includes: obtaining the multi-connection identifier merging rule in the integration attribute of the connection identifiers in the branch dataset, and merging multiple dataset groups into an integrated dataset after filtering the field parameter values ​​according to the multi-connection identifier merging rule, wherein the branch dataset is a dataset with multiple connection identifiers connected on the same side.

[0012] Preferably, step S3 further includes: generating a view result SQL statement based on the formed integrated dataset, validating and parsing the generated view result SQL statement according to the syntax of the SQL statement, and generating a visualization report.

[0013] This invention also discloses a data selection and integration system within multiple datasets, comprising: a field acquisition module, used to acquire the field attributes and field parameter values ​​of multiple datasets interconnected by connection markers, and read the integration attributes of each connection marker; a decomposition module, used to split the connection queue formed by the connection of each dataset into multiple dataset groups arranged sequentially, and to merge adjacent datasets into a branch dataset by filtering the field parameter values ​​of the integration attributes of their connection markers according to a preset merging order, and then filtering and merging the branch dataset with the next dataset until the entire dataset group is merged; and an integration module, used to acquire the integration attributes and connection fields of the connection markers connecting each dataset group, and to merge the branch datasets formed after the dataset groups are integrated into an integrated dataset by filtering the field parameter values ​​of the corresponding connection markers.

[0014] Preferably, the decomposition module is further configured to obtain a starting data table node of the dataset queue in a set direction, traverse each data table node from the starting data table node to the last data table node in the dataset queue using a depth-first algorithm based on the connection relationship of each data table, and form a queue connection path; and obtain and record the association relationship between the last data table node and each data table in reverse from the queue connection path to form at least one dataset group arranged in a sequential order.

[0015] Preferably, in the dataset queue, at least one dataset is connected to multiple different datasets on one side through different connection identifiers. The datasets are arranged in a series by reversing the connection path from the end data table node and recording their association with each data table.

[0016] The present invention also discloses a data selection and integration device for multiple datasets, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of any of the methods described above.

[0017] The present invention also discloses a computer-readable storage medium storing a computer program, characterized in that: when the computer program is executed by a processor, it implements the steps of any of the methods described above.

[0018] This invention discloses a method and system for selecting and integrating data within multiple datasets. It splits a connection queue consisting of multiple interconnected datasets into several dataset groups arranged sequentially. Adjacent datasets are then filtered and merged according to their connection identifiers based on a preset merging order, forming branch datasets. Finally, these branch datasets are filtered and merged to form a single integrated dataset. This method enables the filtering and merging of fields within the connection queue of multiple branched datasets. It achieves more efficient filtering and merging of multiple datasets, meeting data analysis needs in scenarios such as visualizing changes in monitoring data across multiple processes.

[0019] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Description of the Attached Figures

[0020] The accompanying drawings, which are included to provide a further understanding of the invention and form part of this application, illustrate exemplary embodiments of the invention and, together with their description, serve to explain the invention and do not constitute an undue limitation thereof. In the drawings:

[0021] Figure 1 is a flowchart illustrating the data selection and integration method for multiple datasets disclosed in this embodiment.

[0022] Figure 2 is a schematic diagram of multiple datasets connected by lines as disclosed in this embodiment.

[0023] Figure 3 is a schematic diagram of the specific process of step S2 disclosed in this embodiment.

Detailed Implementation

[0024] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments of the present invention without creative effort are within the scope of protection of the present invention.

[0025] Unless otherwise defined, the technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention pertains. The terms “first,” “second,” and similar terms used in the specification and claims of this patent application do not indicate any order, quantity, or importance, but are merely used to distinguish different components. Similarly, the terms “an” or “a” and similar terms do not indicate a limitation of quantity, but rather indicate the presence of at least one.

[0026] Currently, when conducting production process data analysis, in order to better present the comparative changes in monitoring data across multiple processes, it is often necessary to filter and merge multiple datasets into a single target dataset. To achieve this, this embodiment discloses a method for selecting and integrating data from multiple datasets. As shown in Figure 1, the method includes the following:

[0027] Step S1: Obtain the field attributes and field parameter values of multiple datasets that are interconnected by connection identifiers, and read the integration attributes of each connection identifier.

[0028] The production management system includes pre-built datasets for most of the production data, allowing users to easily select from them. Users can choose datasets based on their needs, placing desired data fields from the database into one dataset, or they can freely create new datasets according to their requirements.

[0029] For example, as shown in Figure 2, one dataset contains deviation record data, including fields such as deviation identifier, programming code, deviation status, product code, product name, and product specifications. The other dataset contains batch deviation data, including fields such as deviation identifier, product batch number, work order number, and product name. In this embodiment, the datasets contain multiple fields arranged vertically to reflect the characteristics of each production process. Each feature field in the dataset is associated with corresponding feature data in the target object database.

[0030] Datasets are linked using connection markers, with the two ends of each marker connecting target fields located in the two datasets. These connection markers are configured with integration attributes, including field selection rules and dataset combination relationships. The selection rules are the comparison and filtering logic applied to the parameter values of the record fields at both ends of the connection. The combination relationship is the field filtering rule that determines which fields are output after the two connected datasets are combined. The selection rules for the parameter values of the target fields at both ends of the connection include, but are not limited to: the parameter values of the two target fields being the same, the parameter value of the first target field being greater than that of the second target field, or the parameter value of the first target field being less than that of the second target field. The dataset combination relationship can include all fields of the associated datasets, be based on fields from the first dataset at the connection end, be based on fields from the second dataset at the connection end, or be based on the target field of the connection.
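As a non-limiting illustration, the connection marker and its integration attributes described in [0030] might be modeled as below. This is a minimal sketch, not the disclosed implementation: the names `ConnectionMarker`, `SELECTION_RULES`, `matches`, `output_fields`, and `connect`, and the rule and combination codes, are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
import operator

# Hypothetical rule codes; the source describes the rules but not a data model.
SELECTION_RULES = {
    "eq": operator.eq,  # both target fields hold the same value
    "gt": operator.gt,  # first target field greater than the second
    "lt": operator.lt,  # first target field less than the second
}

@dataclass(frozen=True)
class ConnectionMarker:
    left_field: str       # target field in the first dataset
    right_field: str      # target field in the second dataset
    rule: str = "eq"      # key into SELECTION_RULES
    combine: str = "all"  # "all", "left", "right", or "target"

def matches(marker, left_row, right_row):
    """Apply the marker's selection rule to one pair of records."""
    compare = SELECTION_RULES[marker.rule]
    return compare(left_row[marker.left_field], right_row[marker.right_field])

def output_fields(marker, left_row, right_row):
    """Pick the output fields according to the combination relationship."""
    if marker.combine == "left":
        return dict(left_row)
    if marker.combine == "right":
        return dict(right_row)
    if marker.combine == "target":
        return {marker.left_field: left_row[marker.left_field]}
    return {**right_row, **left_row}  # "all": left values win on name clashes

def connect(marker, left_rows, right_rows):
    """Filter record pairs by the selection rule, then combine them."""
    return [
        output_fields(marker, left, right)
        for left in left_rows
        for right in right_rows
        if matches(marker, left, right)
    ]
```

Here each dataset is assumed to be a list of records (dictionaries keyed by field name), which keeps the selection rule and the combination relationship as two separate, swappable steps.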

[0031] In another embodiment, multiple connections can be established between the two datasets. Each connection has selection rules and a priority level. The connections between the two datasets can be filtered and sorted according to the priority level. The parameter values ​​of the fields at both ends of the connection are compared and selected sequentially according to the selection rules. The field parameter values ​​that meet the requirements are saved as filter values ​​and entered into the field parameter value filter library.

[0032] By obtaining the priority level from the attribute information attached to each connection, the fields at both ends of the connections with the same priority level are filtered according to the judgment selection rules, and the field parameter values ​​corresponding to the judgment selection rules for multiple connections that simultaneously meet the same priority level are entered into the field parameter value filtering library.

[0033] For connections with different priority levels, they are filtered and sorted according to priority level. Only the selection rules corresponding to the highest priority connections are processed, and the field parameter values ​​that meet the requirements are added to the feature field parameter value filtering library as filter values. For connections with different priority levels, if one end of multiple connections is connected to the same feature field in the same dataset, then the feature field parameter values ​​that simultaneously meet the selection rules corresponding to these multiple connections are entered into the feature field parameter value filtering library.

[0034] If the attached attribute information of a connection does not have a priority level, it is assigned the highest priority level and participates in the filtering and sorting process of each connection identifier. The parameter values ​​of the feature fields at both ends of the connection identifier are compared and selected according to the judgment and selection rules of each connection identifier. The field parameter values ​​that meet the requirements are saved as the filtering values ​​and entered into the field parameter value filtering library.
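The priority handling of [0031] to [0034] may be sketched, by way of example only, as follows. The sketch assumes that a smaller number means a higher priority and that a connection without a priority level ties with the highest level present; both are assumptions, since the text does not fix a numbering. The dictionary keys and function names are hypothetical, and the special case in [0033] of connections with different priority levels sharing one feature field is not covered.

```python
def highest_priority_connections(connections):
    """Keep only the connections at the highest priority level.

    Assumptions: a smaller number is a higher priority, and a connection
    with no "priority" entry is treated as having the highest level.
    """
    explicit = [c["priority"] for c in connections
                if c.get("priority") is not None]
    top = min(explicit) if explicit else None
    return [c for c in connections if c.get("priority") in (None, top)]

def filter_values(connections, left_rows, right_rows):
    """Collect record pairs that satisfy every top-priority selection rule.

    Each connection is a dict with "left" and "right" field names and a
    "rule" callable comparing the two field values.
    """
    active = highest_priority_connections(connections)
    return [
        (left, right)
        for left in left_rows
        for right in right_rows
        if all(c["rule"](left[c["left"]], right[c["right"]]) for c in active)
    ]
```

The returned list of record pairs plays the role of the field parameter value filter library in this simplified model.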

[0035] Step S2: The connection queue formed by the connection of each dataset is split into multiple dataset groups arranged in sequence. According to the preset merging order, adjacent datasets are filtered on their field parameter values according to the integration attributes of their connection identifier and merged into a branch dataset. Then, the branch dataset is filtered and merged with the next dataset until the entire dataset group is merged.
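The sequential merging within one dataset group described in step S2 can be illustrated as a left-to-right fold. The following is a minimal sketch under stated assumptions: every connection uses the "values are the same" selection rule on a single shared field name, all fields are kept, and the names `merge_pair` and `merge_group` are hypothetical.

```python
from functools import reduce

def merge_pair(branch_rows, next_rows, key):
    """Filter on equal key values, then merge matching records."""
    index = {}
    for row in next_rows:
        index.setdefault(row[key], []).append(row)
    return [
        {**left, **right}
        for left in branch_rows
        for right in index.get(left[key], [])
    ]

def merge_group(datasets, keys):
    """Fold a dataset group in order: ((d0 + d1) + d2) + ...

    keys[i] is the connection field between the branch dataset built so
    far and datasets[i + 1].
    """
    steps = zip(datasets[1:], keys)
    return reduce(
        lambda branch, step: merge_pair(branch, step[0], step[1]),
        steps,
        datasets[0],
    )
```

For instance, `merge_group([a, b, c], ["dev_id", "batch"])` first merges `a` with `b` on `dev_id` to obtain a branch dataset, then merges that branch dataset with `c` on `batch`.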

[0036] When the object to be merged has three or more interconnected datasets, especially when one dataset has one or more fields connecting to two or more other datasets, it is not possible to directly filter and merge adjacent datasets. It is necessary to split the connection queue formed by these dataset connections into multiple sequentially arranged dataset groups, and then filter and merge each dataset group separately. As shown in Figure 3, this can specifically include the following.

[0037] Step S21: Obtain a starting data table node of the dataset queue in the set direction. From the starting data table node, traverse each data table node to the end data table node in the dataset queue using a depth-first algorithm based on the connection relationship of each data table, forming a queue connection path.

[0038] Step S22: Based on the queue connection path, retrieve and record the association relationship between the data table node at the end and each data table to form at least one dataset group arranged in a sequential order.

[0039] In this embodiment, step S2 further includes: at least one dataset in the dataset queue is connected to multiple different datasets on one side through different connection identifiers, wherein multiple dataset groups are formed by working backward from the end data table node along the queue connection path and recording its association with each data table.
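One possible reading of steps S21 and S22 is sketched below: a depth-first traversal from the starting data table node records, for each end node reached, the path back to the start, and each such reversed path is taken as one dataset group. This interpretation, the adjacency-map representation `links`, and the function name `dataset_groups` are assumptions made for illustration; the sketch also assumes the connection graph is a tree.

```python
def dataset_groups(links, start):
    """Return one group per end node: the path from that node back to start.

    `links` maps a dataset name to the datasets it connects to.
    Assumes the connection graph has no cycles.
    """
    groups = []

    def visit(node, path):
        path = path + [node]
        children = links.get(node, [])
        if not children:               # end data table node reached
            groups.append(path[::-1])  # record the path in reverse
            return
        for child in children:         # depth-first descent
            visit(child, path)

    visit(start, [])
    return groups
```

With `links = {"A": ["B", "C"], "B": ["D"]}`, dataset `A` is connected to two datasets on one side, so the traversal yields two groups, one ending at `D` and one ending at `C`.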

[0040] Specifically, the relationships between multiple datasets can be viewed as a graph structure, with each dataset acting as a node. Starting from the first node, a depth-first search algorithm is used to descend until a last leaf node is found. From that leaf node, its relationships with other datasets are retrieved; each relationship is one of four join types (1: inner join, 2: left join, 3: right join, 4: full join). The relationships between the fields of connected nodes are combined using logical AND operations. Once a node has been processed, the search backtracks along the depth-first path to the previous node, and the relationships between that node and all of its connected nodes are accumulated, yielding the union of the node and all of its relationships. This process is repeated recursively until the SQL relational network of the entire graph is obtained.
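By way of example, the construction of the SQL relational network in [0040] might look like the sketch below. It is a simplification: joins are emitted in depth-first visiting order (so that every table is introduced before it is referenced) rather than by the leaf-first backtracking the text describes, the graph is assumed to be a tree, and every field relationship is assumed to be an equality. The names `JOIN_TYPES`, `build_from_clause`, and the `edges` layout are hypothetical; only the four join codes come from the text.

```python
# Join codes as listed in the text: 1 inner, 2 left, 3 right, 4 full.
JOIN_TYPES = {1: "INNER JOIN", 2: "LEFT JOIN", 3: "RIGHT JOIN", 4: "FULL JOIN"}

def build_from_clause(start, edges):
    """Walk the dataset graph depth-first and emit a SQL FROM clause.

    `edges` maps a parent table to a list of (child, join_code, field_pairs).
    Each (parent_field, child_field) pair becomes one equality, and the
    pairs of one connection are combined with AND.
    Assumes a tree-shaped graph and trusted table and field names.
    """
    clauses = [start]

    def visit(parent):
        for child, join_code, field_pairs in edges.get(parent, []):
            condition = " AND ".join(
                f"{parent}.{pf} = {child}.{cf}" for pf, cf in field_pairs
            )
            clauses.append(f"{JOIN_TYPES[join_code]} {child} ON {condition}")
            visit(child)  # descend before moving on to the next sibling

    visit(start)
    return "FROM " + " ".join(clauses)
```

Because identifiers are interpolated directly into the string, a sketch like this is only suitable where table and field names come from the system's own dataset definitions rather than from free user input.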

[0041] Step S3: Obtain the integration attribute and connection field of the connection identifiers connecting each dataset group. After integrating each dataset group, the branch datasets are merged into an integrated dataset after filtering the field parameter values ​​according to the integration attribute and connection field of the corresponding connection identifiers.

[0042] In this embodiment, step S3 further includes: obtaining the multi-connection identifier merging rule in the integration attribute of the connection identifiers in the branch dataset, and merging multiple dataset groups into an integrated dataset after filtering the field parameter values ​​according to the multi-connection identifier merging rule, wherein the branch dataset is a dataset with multiple connection identifiers connected on the same side.

[0043] Step S3 further includes: generating a view result SQL statement based on the formed integrated dataset, validating the generated view result SQL statement according to the syntax of the SQL statement, and generating a visualization report.
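As one illustration of checking a generated view statement before it is used for reporting, the sketch below hands the statement to SQLite's own parser through Python's standard `sqlite3` module. This substitutes an existing database engine for the syntax validation described in the embodiment and is not the disclosed method; the table names, column names, and the helper `try_create_view` are hypothetical.

```python
import sqlite3

def try_create_view(conn, view_sql):
    """Run a CREATE VIEW statement; return (ok, error_message)."""
    try:
        conn.execute(view_sql)
        return True, ""
    except sqlite3.Error as exc:  # raised for syntax errors, among others
        return False, str(exc)

# Hypothetical sample tables standing in for two connected datasets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deviation (deviation_id INTEGER, status TEXT)")
conn.execute("CREATE TABLE batch (deviation_id INTEGER, batch_no TEXT)")
conn.execute("INSERT INTO deviation VALUES (1, 'open'), (2, 'closed')")
conn.execute("INSERT INTO batch VALUES (1, 'B-01')")

VIEW_SQL = (
    "CREATE VIEW integrated AS "
    "SELECT d.deviation_id, d.status, b.batch_no "
    "FROM deviation d INNER JOIN batch b "
    "ON d.deviation_id = b.deviation_id"
)
ok, error = try_create_view(conn, VIEW_SQL)
# Query the view only if it was accepted by the parser.
rows = conn.execute("SELECT * FROM integrated").fetchall() if ok else []
```

A malformed statement is rejected by the same call, and the returned error message could be surfaced to the user instead of a report.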

[0044] In this embodiment, SQL statements automatically generated by the system or edited by the user can be validated according to the syntax of the SQL statement. The final SQL statement is analyzed using a self-developed lexical analysis method. After successful validation, it can be directly used by system reports or third-party external systems. Through lexical analysis, based on the principle of keyword priority, user-defined SQL statements are parsed and validated, which may specifically include the following:

[0045] Create a keyword library for SQL syntax, forming a collection of all SQL keywords and functions. Separate user-defined SQL statements using spaces as delimiters to create a linked list of SQL statements.

[0046] Perform keyword or function matching on the characters in the linked list, and mark the matched characters with the keyword or function. For any word string, based on the set of keywords and functions, provide suggestions based on the highest match score; for example, if the original string is "sam", suggest whether it is "sum".
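The suggestion step in [0046] can be approximated with the standard library's `difflib.get_close_matches`, which ranks candidates by a similarity ratio. This is a sketch only: the source does not say how the match score is computed, so the similarity measure, the 0.6 cutoff, the small `SQL_WORDS` list, and the function name `suggest` are all assumptions.

```python
import difflib

# A small illustrative subset; a real keyword library would be far larger.
SQL_WORDS = ["SELECT", "FROM", "WHERE", "JOIN", "ON", "AND", "OR",
             "SUM", "COUNT", "AVG", "MIN", "MAX", "AS", "GROUP", "BY"]

def suggest(token, cutoff=0.6):
    """Return the closest known keyword or function name, or None."""
    word = token.upper()
    if word in SQL_WORDS:
        return word
    close = difflib.get_close_matches(word, SQL_WORDS, n=1, cutoff=cutoff)
    return close[0] if close else None
```

With this word list, `suggest("sam")` returns `"SUM"`, mirroring the example in the text, while a token with no sufficiently similar keyword returns `None` and would be treated as an ordinary identifier.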

[0047] According to the rules of the SQL statement, the next node in the linked list of marked strings is validated. However, two keywords cannot be consecutive. If the keyword is a function expression, the function parameters in the statement linked list are searched, and if the parameters do not meet the requirements, a prompt is given. Finally, the entire SQL statement for creating the view is executed.
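The tokenizing and validation rules of [0045] and [0047] may be sketched as follows. This is an illustrative simplification: a Python list stands in for the linked list, function calls are assumed to contain no internal spaces, and the keyword set deliberately omits multi-word constructs such as GROUP BY or LEFT JOIN, which would otherwise trip the "no two consecutive keywords" rule as literally stated. The names `KEYWORDS`, `FUNCTIONS`, `tag_tokens`, and `validate` are hypothetical.

```python
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "ON", "JOIN"}
FUNCTIONS = {"SUM", "COUNT", "AVG", "MIN", "MAX"}

def tag_tokens(sql):
    """Split on whitespace and tag each token by kind."""
    tagged = []
    for token in sql.split():
        name = token.split("(", 1)[0].upper()
        if name in KEYWORDS:
            kind = "keyword"
        elif name in FUNCTIONS and "(" in token:
            kind = "function"
        else:
            kind = "other"
        tagged.append((token, kind))
    return tagged

def validate(sql):
    """Return a list of problems found by the two rules described above."""
    problems = []
    tagged = tag_tokens(sql)
    # Rule 1: a keyword may not be followed directly by another keyword.
    for (tok, kind), (nxt, nxt_kind) in zip(tagged, tagged[1:]):
        if kind == "keyword" and nxt_kind == "keyword":
            problems.append(f"consecutive keywords: {tok} {nxt}")
    # Rule 2: a function expression must carry at least one parameter.
    for tok, kind in tagged:
        if kind == "function":
            args = tok.split("(", 1)[1].rstrip(",").rstrip(")")
            if not args.strip():
                problems.append(f"missing parameter: {tok}")
    return problems
```

An empty result means the statement passed both checks; any message in the list corresponds to the prompt given to the user.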

[0048] This invention discloses a method for selecting and integrating data within multiple datasets. It splits a connection queue consisting of multiple interconnected datasets into several sequentially arranged dataset groups. Adjacent datasets are then filtered and merged according to their connection identifiers based on a preset merging order, forming branch datasets. Finally, these branch datasets are filtered and merged to form a single integrated dataset. This method enables the filtering and merging of fields within the connection queue of multiple branched datasets. It achieves more efficient filtering and merging of multiple datasets, meeting data analysis needs in scenarios such as visualizing changes in monitoring data across multiple processes.

[0049] In another embodiment, a data selection and integration system within multiple datasets is also disclosed, comprising: a field acquisition module, used to acquire the field attributes and field parameter values ​​of multiple datasets interconnected by connection markers, and read the integration attributes of each connection marker; a decomposition module, used to split the connection queue formed by the connection of each dataset into multiple dataset groups arranged sequentially, and to merge adjacent datasets into a branch dataset by filtering the field parameter values ​​of the integration attributes of their connection markers according to a preset merging order, and then to filter and merge the branch dataset with the next dataset until the entire dataset group is merged; and an integration module, used to acquire the integration attributes and connection fields of the connection markers connecting each dataset group, and to merge the branch datasets formed after the dataset groups are integrated into an integrated dataset by filtering the field parameter values ​​of the corresponding connection markers.

[0050] In this embodiment, the decomposition module is further configured to obtain a starting data table node of the dataset queue in a set direction, traverse each data table node from the starting data table node to the last data table node in the dataset queue using a depth-first algorithm based on the connection relationship of each data table, and form a queue connection path; and obtain and record the association relationship between the last data table node and each data table in reverse from the queue connection path to form at least one dataset group arranged in a sequential order.

[0051] In the dataset queue, there is at least one dataset connected to multiple different datasets on one side through different connection identifiers. The datasets are arranged in a series by retrieving and recording the association relationship between the datasets and each data table from the end of the queue connection path.

[0052] It should be noted that the various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. Regarding the multi-dataset data selection and integration system disclosed in the embodiments, since it corresponds to the multi-dataset data selection and integration method disclosed in the embodiments, the description is relatively simple, and relevant parts can be referred to the method section.

[0053] In other embodiments, a multi-dataset data selection and integration apparatus is also provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the multi-dataset data selection and integration method as described in the above embodiments.

[0054] The data selection and integration device within a multi-dataset may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the schematic diagram is merely an example of a data selection and integration device within a multi-dataset and does not constitute a limitation on the device. It may include more or fewer components than illustrated, or combine certain components, or use different components. For example, the data selection and integration device within a multi-dataset may also include input / output devices, network access devices, buses, etc.

[0055] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor can be a microprocessor or any conventional processor. This processor is the control center of the multi-dataset intra-data selection and integration device, connecting various parts of the device via various interfaces and lines.

[0056] The memory can be used to store the computer program and / or modules. The processor, by running or executing the computer program and / or modules stored in the memory, and by calling the data stored in the memory, realizes various functions of the multi-dataset data selection and integration device. The memory may mainly include a program storage area and a data storage area. The program storage area may store the operating system, at least one application program required for a function, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as hard disk, memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one disk storage device, flash memory device, or other volatile solid-state storage device.

[0057] If the multi-dataset intra-data selection and integration device is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above-described embodiments of the present invention can also be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various multi-dataset intra-data selection and integration method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a portable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, etc. It should be noted that the content contained in the computer-readable medium may be appropriately added to or subtracted from the content as required by the legislation and patent practice in the jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium may not include electrical carrier signals and telecommunication signals.

[0058] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

[0059] In summary, the above description is only a preferred embodiment of the present invention. All equivalent changes and modifications made within the scope of the claims of the present invention should be covered by the present invention.

Claims

1. A method for selecting and integrating data within multiple datasets, characterized in that it includes the following steps:
S1: Obtain the field attributes and field parameter values of multiple datasets that are interconnected by connection identifiers, and read the integration attributes of each connection identifier. A priority level is obtained from the attribute information attached to each connection. For connections with the same priority level, the fields at both ends of each connection are filtered according to the judgment selection rules, and the field parameter values that simultaneously satisfy the judgment selection rules of the multiple connections are entered into the field parameter value filtering library. For connections with different priority levels, the connections are sorted by priority level, only the judgment selection rule of the highest-priority connection is processed, and the field parameter values that satisfy it are added to the feature field parameter value filtering library as filter values. For connections with different priority levels, if one end of multiple connections is attached to the same feature field in the same dataset, the feature field parameter values that simultaneously satisfy the judgment selection rules of those connections are entered into the feature field parameter value filtering library.
S2: Split the connection queue formed by the connected datasets into multiple dataset groups arranged in sequence. According to the preset merging order, adjacent datasets are filtered by field parameter value according to the integration attribute of their connection identifier and merged into a branch dataset. The branch dataset is then filtered and merged with the next dataset until all of the datasets have been merged.
S3: Obtain the integration attribute and connection field of the connection identifiers connecting each dataset group. After the dataset groups have been integrated, the branch datasets are filtered by field parameter value according to the integration attribute and connection field of the corresponding connection identifiers, and then merged into an integrated dataset.
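
The priority handling in step S1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim, not the patented implementation: the `Connection` class, the `build_filter_library` function, the convention that a larger number means a higher priority, and the use of a single list of candidate values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Connection:
    """Hypothetical model of one connection and its attached attributes."""
    field: tuple[str, str]          # (dataset name, field name) at one end
    priority: int                   # assumption: larger number = higher priority
    rule: Callable[[object], bool]  # the judgment selection rule


def build_filter_library(connections: list[Connection], values: list) -> set:
    """Return the parameter values admitted to the filtering library."""
    if not connections:
        return set(values)
    priorities = {c.priority for c in connections}
    same_field = len({c.field for c in connections}) == 1
    if len(priorities) == 1 or same_field:
        # Same priority, or all attached to the same feature field:
        # a value must satisfy every rule simultaneously.
        active = connections
    else:
        # Different priorities: only the highest-priority rule is processed.
        top = max(priorities)
        active = [c for c in connections if c.priority == top]
    return {v for v in values if all(c.rule(v) for c in active)}
```

For example, two connections attached to the same field with rules "greater than 10" and "less than 50" admit only values that pass both rules, whereas the same rules on different fields with different priorities admit every value that passes the higher-priority rule alone.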

2. The method for selecting and integrating data within multiple datasets according to claim 1, characterized in that step S2 further includes: obtaining a starting data table node of the dataset queue according to the set direction; starting from the starting data table node, traversing each data table node up to the end data table node in the dataset queue using a depth-first algorithm based on the connection relationships between the data tables, thereby forming a queue connection path; and, following the queue connection path in reverse from the end data table node, retrieving and recording its association relationship with each data table, thereby forming at least one dataset group arranged in sequence.

3. The method for selecting and integrating data within multiple datasets according to claim 2, characterized in that, in step S2, the dataset queue contains at least one dataset that is connected on one side to multiple different datasets via different connection identifiers, and multiple dataset groups are formed by tracing back from the end data table node along the queue connection path and recording its association with each data table.
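
The traversal described in claims 2 and 3 can be sketched as follows. This is an illustrative reading only: the function name `dataset_groups`, the adjacency-dictionary representation, and the assumption that the connections form a tree (each data table is reached from exactly one predecessor and there are no cycles) are not taken from the patent text.

```python
def dataset_groups(links: dict[str, list[str]], start: str) -> list[list[str]]:
    """Depth-first walk from the starting table; each end table is traced
    back to the start to form one sequentially ordered dataset group."""
    groups = []
    parent = {start: None}   # records how each table was reached
    stack = [start]
    while stack:
        node = stack.pop()
        children = links.get(node, [])
        # Push in reverse so the first listed connection is explored first.
        for child in reversed(children):
            parent[child] = node
            stack.append(child)
        if not children:
            # End data table node: follow the recorded links in reverse.
            path = []
            current = node
            while current is not None:
                path.append(current)
                current = parent[current]
            groups.append(path[::-1])
    return groups


# Hypothetical queue: table B is connected to both C and D on one side.
links = {"A": ["B"], "B": ["C", "D"], "C": [], "D": ["E"]}
print(dataset_groups(links, "A"))
# → [['A', 'B', 'C'], ['A', 'B', 'D', 'E']]
```

A queue without branches yields a single group, while each branch point such as table B contributes one additional group, which matches the "at least one" and "multiple" wording of the two claims.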

4. The method for selecting and integrating data within multiple datasets according to claim 3, characterized in that step S3 further includes: obtaining the multi-connection identifier merging rule from the integration attributes of the connection identifiers in the branch dataset, filtering the field parameter values according to the multi-connection identifier merging rule, and then merging the multiple dataset groups into an integrated dataset, wherein the branch dataset is a dataset that has multiple connection identifiers connected on the same side.

5. The method for selecting and integrating data within multiple datasets according to claim 4, characterized in that step S3 further includes: generating a view result SQL statement based on the integrated dataset, validating the generated view result SQL statement against SQL syntax, and generating a visualization report.
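
The generate-then-validate step of claim 5 might look like the sketch below. The claim does not name a database engine, so the use of SQLite (through Python's standard `sqlite3` module) as the validator is an assumption, as are the helper names `build_select_sql` and `is_valid_sql` and the example tables.

```python
import sqlite3


def build_select_sql(base, joins, where=None):
    """Assemble a SELECT over the integrated dataset.

    joins is a list of (table, left_column, right_column) tuples, where the
    columns are already qualified with their table names (hypothetical layout).
    """
    parts = [f'SELECT * FROM "{base}"']
    for table, left, right in joins:
        parts.append(f'JOIN "{table}" ON {left} = {right}')
    if where:
        parts.append(f"WHERE {where}")
    return " ".join(parts)


def is_valid_sql(conn, sql):
    """Ask SQLite to compile the statement without running the query."""
    try:
        conn.execute("EXPLAIN " + sql)
        return True
    except sqlite3.Error:
        return False


# Hypothetical schema used only to give the validator something to check against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, batch_id INTEGER)")
conn.execute("CREATE TABLE batches (id INTEGER)")

sql = build_select_sql("orders", [("batches", "orders.batch_id", "batches.id")])
print(is_valid_sql(conn, sql))                    # → True
print(is_valid_sql(conn, "SELEC * FROM orders"))  # → False
```

Prefixing the statement with `EXPLAIN` makes SQLite parse and compile it, so both syntax errors and references to missing tables are rejected before any report is built from the result.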

6. A data selection and integration system within multiple datasets, characterized in that it includes:
A field acquisition module, used to acquire the field attributes and field parameter values of multiple datasets interconnected by connection identifiers, and to read the integration attributes of each connection identifier. The module obtains the priority level from the attribute information attached to each connection. For connections with the same priority level, it filters the fields at both ends of each connection according to the judgment selection rules and enters the field parameter values that simultaneously satisfy the judgment selection rules of the multiple connections into the field parameter value filtering library. For connections with different priority levels, it sorts the connections by priority level, processes only the judgment selection rule of the highest-priority connection, and adds the field parameter values that satisfy it to the feature field parameter value filtering library as filter values. For connections with different priority levels, if one end of multiple connections is attached to the same feature field in the same dataset, the feature field parameter values that simultaneously satisfy the judgment selection rules of those connections are entered into the feature field parameter value filtering library.
A decomposition module, used to split the connection queue formed by the connected datasets into multiple dataset groups arranged in sequence. According to the preset merging order, adjacent datasets are filtered by field parameter value according to the integration attribute of their connection identifier and merged into a branch dataset. The branch dataset is then filtered and merged with the next dataset until all of the datasets have been merged.
An integration module, used to obtain the integration attributes and connection fields of the connection identifiers connecting each dataset group. After the dataset groups have been integrated, the branch datasets are filtered by field parameter value according to the integration attributes and connection fields of the corresponding connection identifiers, and then merged to form the integrated dataset.
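
The filter-then-merge loop performed by the decomposition and integration modules can be sketched as a simple fold over one dataset group. The sketch assumes that each dataset is a list of row dictionaries, that merging behaves like an inner join on the connection field, and that the filtering library is a set of allowed key values; the function names and sample data are hypothetical.

```python
def filter_and_merge(left, right, left_key, right_key, allowed=None):
    """Join two row lists on the connection field, keeping only rows whose
    key value is in the filtering library (allowed=None means no filter)."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    merged = []
    for row in left:
        key = row[left_key]
        if allowed is not None and key not in allowed:
            continue
        for match in index.get(key, []):
            merged.append({**row, **match})
    return merged


def integrate(datasets, links):
    """Merge adjacent datasets in order; each intermediate result plays the
    role of the branch dataset that is merged with the next dataset."""
    branch = datasets[0]
    for nxt, (left_key, right_key, allowed) in zip(datasets[1:], links):
        branch = filter_and_merge(branch, nxt, left_key, right_key, allowed)
    return branch


orders = [{"order_id": 1, "batch": "b1"}, {"order_id": 2, "batch": "b2"}]
batches = [{"batch": "b1", "line": "L1"}, {"batch": "b2", "line": "L2"}]
lines = [{"line": "L1", "temp": 70}, {"line": "L2", "temp": 75}]
links = [("batch", "batch", {"b1"}), ("line", "line", None)]
print(integrate([orders, batches, lines], links))
# → [{'order_id': 1, 'batch': 'b1', 'line': 'L1', 'temp': 70}]
```

Folding from the first dataset onward keeps only one intermediate result in play at a time, which mirrors the claim's wording that the branch dataset is repeatedly merged with the next dataset.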

7. The data selection and integration system within multiple datasets according to claim 6, characterized in that the decomposition module is further configured to: obtain a starting data table node of the dataset queue according to the set direction; starting from the starting data table node, traverse each data table node up to the end data table node in the dataset queue using a depth-first algorithm based on the connection relationships between the data tables, forming a queue connection path; and, following the queue connection path in reverse from the end data table node, obtain and record its association relationship with each data table, forming at least one dataset group arranged in sequence.

8. The data selection and integration system within multiple datasets according to claim 7, characterized in that the dataset queue contains at least one dataset that is connected on one side to multiple different datasets via different connection identifiers, and multiple dataset groups are formed by tracing back from the end data table node along the queue connection path and recording its association with each data table.

9. A data selection and integration device for multiple datasets, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the method as described in any one of claims 1-6.

10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method as described in any one of claims 1-6.