An artificial intelligence-based industry data management and sharing platform
By constructing the iDS3 industrial data space based on data lake and data weaving, unified management and sharing of data have been achieved, solving the problems of data quality, sharing and integration, and improving the efficiency and security of the data management platform.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA CLOUD OPEN SOURCE DATA TECH (SHANGHAI) CO LTD
- Filing Date
- 2023-06-25
- Publication Date
- 2026-06-12
AI Technical Summary
Existing data management and sharing platforms have shortcomings in areas such as data quality assurance, inter-enterprise data sharing, data fusion and integration, and data security and governance, resulting in the ineffective utilization of data assets and difficulty in maximizing profits.
Construct a data management platform based on a data lake, combine it with the decentralized, distributed, symbiotic and shared industrial data space iDS3 with a data weaving architecture, add a trusted data transaction mechanism, realize unified collection, modeling, storage and sharing of metadata, and conduct integrated governance, security and auditing of data through metadata DIKube management, and build data applications.
It enables full lifecycle management of data, ensures data quality, supports various types of queries, realizes ready-to-use data applications, solves the problem of data silos, improves the efficiency of data sharing and integration, and reduces management costs.
[Figure: abstract drawing, CN116775605B]
Description
Technical Field
[0001] This invention relates to the field of intelligent manufacturing, and in particular to an industrial data management and sharing platform based on artificial intelligence.
Background Technology
[0002] Existing open-source technologies, methods, and architectures mainly include the following four types:
[0003] 1) Data Lake
[0004] A data lake is a big data platform that can store, query, and cross-analyze structured and unstructured data of diverse formats, and that provides data management, governance, and assetization capabilities. All data in a data lake can be stored in its original format without size limitations. Each data element in a data lake has a unique identifier that can be used to locate it. Common data management components for implementing a data lake include data access, data migration, data governance, quality management, asset cataloging, access control, task management, task orchestration, and metadata management. Lakehouse integration is a newer data architecture that combines the advantages of data warehouses and data lakes while reconciling their differences. Building the data warehouse on top of the data lake makes storage more cost-effective and flexible. Lakehouse integration improves data quality and reduces data redundancy. In a lakehouse, ETL (Extract-Transform-Load) transforms unprocessed data in the data lake layer into structured data in the data warehouse layer. Its features include unified data management, multimodal storage engines, rich data engines, a converged data platform, and on-demand application modeling.
[0005] 2) Data Space
[0006] Data Space is an information system based on heterogeneous data sources, designed to support intelligent data applications. Data Space can utilize different technologies depending on the application scenario, thus adapting to various access methods and providing different levels of service. By constructing a unified description of the data, Data Space integrates data from multiple data sources, focusing more on discovering data relationships than managing data. Therefore, Data Space emphasizes supporting intelligent decision-making through understanding data. Its "on-demand" functionality, through collecting feedback during use, provides users with an increasingly better experience and enables Data Space to adapt to new data sources. Data Space can have different structural designs in different working environments.
[0007] 3) Data weaving
[0008] Data weaving (also known as data fabric) is a design concept that serves as an integration layer for data and connecting processes; it describes a data architecture. It leverages continuous metadata analysis to support the design, deployment, and utilization of integrated and reusable data across environments, including hybrid and multi-cloud platforms. Data weaving is a product of the evolution of data architecture. In addition to inheriting existing data integration and data governance technologies, it adds several new capabilities, primarily embedded artificial intelligence and machine learning, semantic knowledge graphs, automation of repetitive tasks, active metadata, dynamic data integration, data cataloging, and automated data orchestration.
[0009] 4) Data Grid
[0010] A data grid (also known as a data mesh) is a solution architecture designed to build business-centric data products. It guides the design of solution architectures within a technology-agnostic framework. As a solution, the fundamental requirements of a data grid include data as a product, distributed data governance, and empowerment. Essentially, a data grid solution organizes data around business domain owners and transforms relevant data sources into data products that serve distributed business users from different business domains or functions. These data products can be created, managed, and used in an autonomous, decentralized, and self-service manner.
[0011] The shortcomings of existing open-source technologies, methods, and architectures mentioned above:
[0012] 1) As a scalable infrastructure for big data storage, processing, and analysis, the most significant drawback of a data lake is its inability to guarantee the quality of acquired data. The lake-warehouse integrated data architecture is still a combination of data warehouse and data lake architectures, and suffers from high complexity, high management costs, and significant data security challenges. In addition, enterprise lake-warehouse integrated architectures are often business-specific and lack universality, so they fail to form a complete architectural system. Open issues include the lack of a complete data lifecycle, unclear definitions of storage objects, data governance capabilities that need optimization, and supporting application software that still needs to be supplemented. Furthermore, enterprise data is currently mostly managed centrally by a central data team, without efficient data sharing mechanisms between business areas, which easily leads to data silos.
[0013] 2) Data space management systems still have shortcomings in three aspects: data modeling and querying, local storage and indexing, and data space discovery. Current data modeling cannot handle different levels of uncertainty, cannot populate the data space in a ready-to-use manner, and cannot efficiently support all types of queries, such as keyword queries, structured queries, and metadata queries. Efficient and universal local storage and indexing components are lacking, and existing data space management systems cannot locate the relationships between new participants and existing data sources.
[0014] 3) Data weaving is a technological infrastructure for accessing and moving data, but it leans towards data access rather than data management. When creating replicable, reusable data streams, as well as replicable, reusable data cleaning and conditioning routines, the priorities of managing and accessing data can easily conflict, making it difficult to achieve immediate usability. Furthermore, data weaving focuses on addressing the increasing diversity, distribution, scale, and complexity of data assets within enterprises, but it cannot solve the problem of data sharing between enterprises.
[0015] 4) From the perspective of a data management platform, a data grid is a downstream application of data products. From the perspective of application domains, a data grid cannot combine data into an OLAP engine or create a data warehouse (DWH).
[0016] Therefore, it is necessary to improve such existing technologies to overcome the aforementioned shortcomings.
Summary of the Invention
[0017] The purpose of this invention is to provide an AI-based industrial data management and sharing platform to enable data sharing among enterprises and provide services such as digital transformation, intelligent upgrading, and integrated innovation. This addresses the problem of data management being misaligned with application needs in existing platforms such as data lakes, lake warehouses, data spaces, data weaving, and data grids. This misalignment leads to situations where an enterprise's data assets cannot be effectively utilized when its application needs are unclear, making it difficult to maximize profits from the costs invested in data management.
[0018] The above-mentioned technical objective of the present invention is achieved through the following technical solution:
[0019] An artificial intelligence-based industrial data management and sharing platform includes the following steps:
[0020] 1) Build a data management platform based on the data lake;
[0021] 2) Based on the data management platform, create a decentralized, distributed, symbiotic, and shared industrial data space iDS3 based on a data weaving architecture;
[0022] 3) Data ingestion from data sources; importing data from various databases, message queues, and file storage into the industrial data space iDS3 for unified data management;
[0023] 4) Add a trusted data transaction mechanism to the industrial data space iDS3;
[0024] 5) Enhanced metadata management; combining the capabilities of data lake and data weaving to achieve unified collection, modeling, storage, and sharing of metadata;
[0025] 6) Metadata-based DIKube management; pre-generating metadata-based DIKube based on industry mechanisms;
[0026] 7) Integrated data governance; connecting the data lake and application layer through iDS3 to achieve metadata and data management;
[0027] 8) Data security and auditing; through audit logs and data quality checks, monitor the data source and data quality of key ETL tasks in the data lake from three aspects: data writing, data consistency, and data source orchestration and scheduling;
[0028] 9) Build data applications.
[0029] Furthermore, the steps for building a data management platform based on a data lake are as follows:
[0030] 1) Construct a distributed big data cluster environment;
[0031] 2) Install the cluster visualization management plugin and environment deployment dependency plugins;
[0032] 3) Deploy the storage data volume management component, the distributed object storage component MinIO, component services, the data query component Trino, and the BI tool Superset.
[0033] Furthermore, the industrial data space iDS3 is used for ubiquitous storage of metadata, which adopts the DIKube data storage method; each data tensor in DIKube stores only metadata and its corresponding unique ID.
[0034] Furthermore, the trusted data transaction mechanism includes the following three types:
[0035] 4.1) A trustworthy mechanism for data transactions is established through unique IDs and blockchain technology;
[0036] 4.2) Trace the operation history of data through audit records;
[0037] 4.3) Backtrack to a specified historical version; based on fully recording data characteristics, generate an identifier ID from the extracted metadata and tags, and store it in the industrial data tensor iDIKube to establish a ubiquitous data governance mechanism.
[0038] Furthermore, the steps for metadata enhancement management are as follows:
[0039] 5.1) Extract relevant instances, entities, and relationships from multi-source heterogeneous data based on semantics to achieve RDF-based data assimilation;
[0040] 5.2) Construct an industry data catalog and an industry data coding system zyxID. Based on industrial mechanisms, the data is summarized and mapped to nodes in the industrial lexigraph iLexiGraph, and the semantics of the data are enriched by the association between the lexigraph nodes.
[0041] 5.3) Based on the complete metadata provided by Lakehouse, the metadata is aggregated and integrated in iDS3, and management and sharing are realized;
[0042] 5.4) Combining the table information in the data lake with the semantic relationships of data assets established by iDS3, the change trajectories of the data platform and external systems are correlated.
[0043] Furthermore, the steps for integrated data governance are as follows:
[0044] 7.1) Ensure data quality of the data platform based on the data lake, and use data weaving to connect the data platform with external systems;
[0045] 7.2) Relying on iDS3's data security and lifecycle management mechanism, manage the data access and read / write processes in the data lake;
[0046] 7.3) Analyze and mine the historical access and change information of the data lake through audit logs and compliance management.
[0047] Furthermore, the implementation methods for data security and auditing are as follows:
[0048] 8.1) Use iDS3 to extract data from the data source, and use the data lake to write the extracted data into the data management platform;
[0049] 8.2) Based on the transactionality and data quality mechanisms of the data lake, simplify the data consistency of iDS3 in ETL / ELT;
[0050] 8.3) Implement the orchestration and scheduling of distributed ETL / ELT tasks based on iDS3, and load data from the source to the data lake.
[0051] Furthermore, the steps for constructing the data application are as follows:
[0052] 9.1) Use data lake metadata as the data source in iDS3;
[0053] 9.2) Use iDS3 in the constructed data management platform to simplify data analysis processes and steps;
[0054] 9.3) Utilize the ACID transaction mechanism of the data lake to enable iDS3 to provide high-quality analytical data sources;
[0055] 9.4) Add a trusted data transaction mechanism to iDS3 and implement the iDS3 connector for each organization;
[0056] 9.5) All organizations exchange information in accordance with the methods specified in the Data Space Rules Manual;
[0057] 9.6) Provide users with industry-oriented, scenario-driven, on-demand, real-time data application services and interactive massive dataset analysis and display.
[0058] In summary, the present invention has the following beneficial effects:
[0059] 1) Data security and auditing ensure the data quality in the industrial data management platform from three aspects: data writing, data consistency, and data source orchestration and scheduling.
[0060] 2) The trusted data transaction mechanism and metadata enhancement management enable the management of the entire data lifecycle.
[0061] 3) RDF-based data assimilation and iLexiGraph, a lexigraph built based on industrial mechanisms, can solve the problem of supporting various types of queries.
[0062] 4) Metadata governance based on iDS3 enables ready-to-use metadata applications.
[0063] 5) A knowledge graph with scalable industry mechanisms can enable generative application scenarios based on artificial intelligence when users are added.
Attached Figure Description
[0064] Figure 1 is a schematic diagram of the artificial intelligence-based industrial data management and sharing platform described in this invention.
[0065] Figure 2 shows the industrial data management platform described in this invention being used for data sharing among enterprises.
[0066] Figure 3 is a diagram illustrating the implementation steps of the generative application scenario based on artificial intelligence as described in this invention.
Detailed Implementation
[0067] To make the technical means, creative features, objectives and effects of this invention easier to understand, the invention will be further described below with reference to the figures and specific embodiments.
[0068] The present invention proposes an artificial intelligence-based industrial data management and sharing platform, comprising the following steps:
[0069] I. Building a data management platform based on a data lake
[0070] Building a data management platform based on a data lake architecture refers to constructing a unified lake-warehouse data management platform capable of managing data. The steps are as follows:
[0071] 1) Construct a distributed big data cluster environment;
[0072] 2) Install the cluster visualization management plugin and environment deployment dependency plugins;
[0073] 3) Deploy the storage data volume management component, the distributed object storage component MinIO, component services, the data query component Trino, and the BI tool Superset.
[0074] II. Building upon the data management platform constructed in Step I, create a decentralized, distributed, symbiotic, and shared industrial data space, iDS3, based on a data weaving architecture.
[0075] The iDS3 data space created on the data management platform has two key features: 1) Ubiquitous storage of metadata, which avoids the high cost of data integration and protects enterprise data security; 2) The DIKube data storage method, in which each data tensor in DIKube stores only metadata and its corresponding unique ID. The ID is encoded based on industrial mechanisms, and different applications use different encoding rules. This ensures data symbiosis and sharing within and between enterprises, and solves the problem that native data weaving architectures cannot support real-time, on-demand data fusion and integration.
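The storage idea in this paragraph can be illustrated with a minimal sketch, assuming a DIKube cell can be modelled as a record that holds only metadata and a unique ID. The class names, the three coordinate axes, and the `encode_id` rule are hypothetical; the specification does not disclose the actual encoding based on industrial mechanisms.

```python
import hashlib
from dataclasses import dataclass, field


def encode_id(domain: str, source: str, name: str) -> str:
    """Hypothetical ID rule: a domain prefix plus a short content hash."""
    digest = hashlib.sha256(f"{source}/{name}".encode("utf-8")).hexdigest()[:12]
    return f"{domain}-{digest}"


@dataclass(frozen=True)
class Cell:
    """One tensor cell: metadata and its unique ID, never the raw data."""
    uid: str
    metadata: dict


@dataclass
class DIKube:
    """Sparse tensor keyed by (enterprise, topic, period) coordinates (assumed axes)."""
    cells: dict = field(default_factory=dict)

    def put(self, coord: tuple, domain: str, source: str, name: str, **meta) -> str:
        """Register metadata for a data element and return its unique ID."""
        uid = encode_id(domain, source, name)
        self.cells[coord] = Cell(uid, {"source": source, "name": name, **meta})
        return uid

    def find(self, **criteria) -> list:
        """Return cells whose metadata matches every given key/value pair."""
        return [
            cell for cell in self.cells.values()
            if all(cell.metadata.get(k) == v for k, v in criteria.items())
        ]


# Example: the raw measurements stay at the source; only metadata is registered.
cube = DIKube()
uid = cube.put(
    ("plant-a", "vibration", "2023-06"),
    domain="steel",
    source="minio://lake/raw",  # hypothetical locator
    name="spindle_vibration",
    unit="mm/s",
)
matches = cube.find(unit="mm/s")
```

Because a cell carries a locator rather than a copy of the data, the source system remains the only place where the raw values live, which is the property the paragraph credits for lower integration cost.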
[0076] III. Data Ingestion from Data Sources
[0077] Use a self-developed connector to synchronize data from various databases, message queues, and file storage to iDS3 for unified data management.
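The connector itself is not described in the specification. As a rough sketch of the pattern, assuming each source type exposes a common `read()` method, the three source kinds named above can be unified as follows. All class names are hypothetical, and an in-memory SQLite database, a `queue.Queue`, and a CSV text stream stand in for real databases, message queues, and file storage.

```python
import csv
import io
import queue
import sqlite3
from typing import Iterator, Protocol


class Connector(Protocol):
    """Hypothetical common interface for all source connectors."""
    name: str

    def read(self) -> Iterator[dict]: ...


class SqlConnector:
    def __init__(self, name, connection, query):
        self.name, self._conn, self._query = name, connection, query

    def read(self):
        cursor = self._conn.execute(self._query)
        columns = [col[0] for col in cursor.description]
        for row in cursor:
            yield dict(zip(columns, row))


class QueueConnector:
    def __init__(self, name, source_queue):
        self.name, self._queue = name, source_queue

    def read(self):
        # Drain whatever is currently queued, then stop.
        while True:
            try:
                item = self._queue.get_nowait()
            except queue.Empty:
                return
            yield item


class CsvConnector:
    def __init__(self, name, text_stream):
        self.name, self._stream = name, text_stream

    def read(self):
        yield from csv.DictReader(self._stream)


def ingest(connectors):
    """Collect records from every connector, tagging each with its origin."""
    unified = []
    for connector in connectors:
        for record in connector.read():
            unified.append({"_source": connector.name, **record})
    return unified


# Example with one source of each kind.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (id INTEGER, value REAL)")
db.execute("INSERT INTO readings VALUES (1, 2.5)")
mq = queue.Queue()
mq.put({"id": 2, "value": 3.5})
csv_file = io.StringIO("id,value\n3,4.5\n")

records = ingest([
    SqlConnector("db", db, "SELECT id, value FROM readings"),
    QueueConnector("mq", mq),
    CsvConnector("file", csv_file),
])
```

Note that the CSV reader yields strings, so type normalization would still be needed downstream; the sketch leaves that out.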
[0078] IV. Add a trusted data transaction mechanism to iDS3
[0079] First, three trusted transaction mechanisms are added: 1) a mechanism to ensure the trustworthiness of data transactions through unique IDs and blockchain technology; 2) a mechanism to trace the operation history of data through audit records; and 3) a mechanism to trace back to a specified historical version. Then, based on fully recording data characteristics, the extracted metadata and tags are used to generate identifier IDs, which are then stored in the industrial data tensor iDIKube to establish a ubiquitous data governance mechanism.
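The three mechanisms can be sketched together under simplifying assumptions: a local hash-linked log stands in for blockchain technology (it is not a distributed ledger), and every class and field name is hypothetical. The sketch shows tamper-evident audit records keyed by unique ID, history tracing, and rollback to a specified version.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64


def _digest(record: dict, prev_hash: str) -> str:
    """Hash a record together with the previous entry's hash."""
    body = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class AuditedStore:
    versions: dict = field(default_factory=dict)  # uid -> list of metadata snapshots
    chain: list = field(default_factory=list)     # hash-linked audit records

    def _append(self, record: dict) -> None:
        prev_hash = self.chain[-1]["hash"] if self.chain else GENESIS
        self.chain.append(
            {"record": record, "prev": prev_hash, "hash": _digest(record, prev_hash)}
        )

    def write(self, uid: str, metadata: dict, actor: str) -> int:
        """Store a new version and log the operation; returns the version number."""
        history = self.versions.setdefault(uid, [])
        history.append(dict(metadata))
        version = len(history) - 1
        self._append({"op": "write", "uid": uid, "version": version, "actor": actor})
        return version

    def rollback(self, uid: str, version: int, actor: str) -> dict:
        """Restore an earlier snapshot as the newest version and log it."""
        snapshot = dict(self.versions[uid][version])
        self.versions[uid].append(snapshot)
        self._append({"op": "rollback", "uid": uid, "version": version, "actor": actor})
        return snapshot

    def history(self, uid: str) -> list:
        """Trace every logged operation for one ID."""
        return [e["record"] for e in self.chain if e["record"]["uid"] == uid]

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks the links."""
        prev_hash = GENESIS
        for entry in self.chain:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(entry["record"], prev_hash):
                return False
            prev_hash = entry["hash"]
        return True


store = AuditedStore()
first = store.write("steel-001", {"unit": "C"}, actor="alice")
second = store.write("steel-001", {"unit": "K"}, actor="bob")
restored = store.rollback("steel-001", first, actor="carol")
```

A rollback is recorded as a new version rather than by deleting later ones, so the operation history stays complete.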
[0080] V. Enhanced Metadata Management
[0081] Metadata enhancement management refers to combining the capabilities of data lakes and data weaving to achieve unified collection, modeling, storage, and sharing of metadata through the following steps.
[0082] 1) Extract relevant instances, entities and relationships from multi-source heterogeneous data based on semantics to achieve data assimilation based on RDF (Resource Description Framework).
[0083] 2) Construct an industry data catalog and an industry data coding system zyxID. Based on industrial mechanisms, the data is summarized and mapped to nodes in the industry lexigraph iLexiGraph, and the semantics of the data are enriched by the association between the lexigraph nodes.
[0084] 3) Based on the complete metadata provided by Lakehouse, the metadata is aggregated and integrated in iDS3, and managed and shared.
[0085] 4) Combining the table information in the data lake with the semantic relationships of data assets established by iDS3, the change trajectories of the data platform and external systems are correlated.
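Steps 1) and 2) above can be sketched as follows, assuming RDF statements are represented as plain (subject, predicate, object) tuples rather than through an RDF library. The namespace, the field-to-node mapping, and the tiny lexigraph are invented for illustration and are not taken from iLexiGraph or zyxID.

```python
EX = "http://example.org/ids3/"  # hypothetical namespace

# Hypothetical lexigraph: each node lists the nodes it is associated with.
LEXIGRAPH = {
    "Temperature": {"Furnace"},
    "Furnace": {"Temperature", "ProductionLine"},
    "ProductionLine": {"Furnace"},
}

# Hypothetical mapping from heterogeneous source field names to lexigraph nodes.
FIELD_TO_NODE = {
    "temp_c": "Temperature",
    "furnace_temp": "Temperature",
    "line_no": "ProductionLine",
}


def row_to_triples(entity_id: str, row: dict) -> list:
    """Turn one source record into triples that share a common vocabulary."""
    subject = EX + entity_id
    triples = []
    for field_name, value in row.items():
        node = FIELD_TO_NODE.get(field_name)
        if node is None:
            continue  # unmapped fields are skipped in this sketch
        triples.append((subject, EX + "has" + node, str(value)))
    return triples


def related_nodes(node: str) -> set:
    """Associated nodes that enrich the semantics of a mapped field."""
    return LEXIGRAPH.get(node, set())


# Two sources name the same quantity differently but assimilate to one predicate.
from_mes = row_to_triples("furnace-1", {"temp_c": 1510, "operator": "x"})
from_scada = row_to_triples("furnace-2", {"furnace_temp": 1498})
```

After assimilation, a query for the `hasTemperature` predicate reaches both sources even though neither uses that name natively.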
[0086] VI. Metadata-based DIKube Management
[0087] Based on industrial mechanisms, a metadata-based DIKube is pre-generated. iDS3 uses ANSI SQL for data retrieval, so users do not need to master a variety of complex SQL statements.
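As an illustration of retrieving pre-generated metadata with standard SQL, the sketch below uses an in-memory SQLite database as a stand-in for the platform's query component. The table name, columns, and `minio://` locators are assumptions, not part of the specification.

```python
import sqlite3

# In-memory stand-in for a pre-generated, metadata-only DIKube table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dikube_meta ("
    "uid TEXT PRIMARY KEY, topic TEXT, source_uri TEXT, unit TEXT)"
)
conn.executemany(
    "INSERT INTO dikube_meta (uid, topic, source_uri, unit) VALUES (?, ?, ?, ?)",
    [
        ("steel-001", "vibration", "minio://lake/raw/vibration.parquet", "mm/s"),
        ("steel-002", "temperature", "minio://lake/raw/furnace.parquet", "C"),
        ("steel-003", "vibration", "minio://lake/raw/gearbox.parquet", "mm/s"),
    ],
)


def find_sources(connection, topic):
    """Look up IDs and source locators for one topic with a plain SELECT."""
    cursor = connection.execute(
        "SELECT uid, source_uri FROM dikube_meta WHERE topic = ? ORDER BY uid",
        (topic,),
    )
    return cursor.fetchall()


vibration_sources = find_sources(conn, "vibration")
```

The query touches only metadata; fetching the referenced files would be a separate step driven by the returned locators.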
[0088] VII. Integrated Data Governance
[0089] Integrated data governance refers to connecting the data lake and the application layer through iDS3 to manage metadata and data. The implementation steps are as follows:
[0090] 1) Ensure data quality of the data platform based on the data lake, and use data weaving to connect the data platform with external systems.
[0091] 2) Relying on iDS3's data security and lifecycle management mechanism, manage the data access and read / write processes in the data lake.
[0092] 3) Analyze and mine the historical access and change information of the data lake through audit logs and compliance management.
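Steps 2) and 3) can be sketched as an access check that writes an audit entry for every request, followed by a simple analysis of the resulting log. The role names, policy table, and log fields are hypothetical; the specification does not define iDS3's security model at this level.

```python
from collections import Counter

# Hypothetical role-based policy: role -> permitted operations.
POLICY = {"analyst": {"read"}, "engineer": {"read", "write"}}


def request_access(log, user, role, table, op):
    """Decide whether an operation is allowed and record the decision."""
    allowed = op in POLICY.get(role, set())
    log.append(
        {"user": user, "role": role, "table": table, "op": op, "allowed": allowed}
    )
    return allowed


def summarize(log):
    """Count permitted operations per table and list the denied attempts."""
    counts = Counter((e["table"], e["op"]) for e in log if e["allowed"])
    denied = [(e["user"], e["table"], e["op"]) for e in log if not e["allowed"]]
    return counts, denied


audit_log = []
request_access(audit_log, "ann", "analyst", "orders", "read")
request_access(audit_log, "ann", "analyst", "orders", "write")
request_access(audit_log, "bo", "engineer", "orders", "write")
access_counts, denied_attempts = summarize(audit_log)
```

Logging denied requests alongside permitted ones is what makes the later compliance analysis possible.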
[0093] VIII. Data Security and Auditing
[0094] Data security and auditing refers to monitoring the data source and quality of critical ETL tasks in a data lake from three aspects: data writing, data consistency, and data source orchestration and scheduling, through audit logs and data quality checks. The implementation method is as follows:
[0095] 1) Use iDS3 to extract data from the data source, and use the data lake to write the extracted data into the data management platform.
[0096] 2) Based on the transactional and data quality mechanisms of the data lake, simplify the data consistency issues related to iDS3 in ETL / ELT (Extract-Load-Transform).
[0097] 3) Based on iDS3, implement the orchestration and scheduling of distributed ETL / ELT tasks, and load data from the source to the data lake.
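The three monitored aspects (data writing, data consistency, and orchestration and scheduling) can be sketched with a dependency-ordered task runner and a post-load quality report. This is an illustrative single-process sketch, not the distributed scheduler of the platform; the task names and report fields are assumptions. It relies on `graphlib`, available in Python 3.9 and later.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run tasks in dependency order and keep an audit trail of what ran."""
    order = list(TopologicalSorter(dependencies).static_order())
    state, audit = {}, []
    for name in order:
        tasks[name](state)
        audit.append(name)
    return state, audit


def extract(state):
    # Stand-in for pulling records from a source system.
    state["raw"] = [
        {"id": 1, "value": "2.5"},
        {"id": 2, "value": None},
        {"id": 2, "value": "3.0"},
    ]


def load(state):
    # Stand-in for writing the extracted records into the data lake.
    state["lake"] = list(state["raw"])


def check(state):
    # Quality checks: write completeness, missing values, duplicate keys.
    rows = state["lake"]
    state["report"] = {
        "written": len(rows) == len(state["raw"]),
        "null_values": sum(1 for r in rows if r["value"] is None),
        "duplicate_ids": len(rows) - len({r["id"] for r in rows}),
    }


TASKS = {"extract": extract, "load": load, "check": check}
DEPENDENCIES = {"load": {"extract"}, "check": {"load"}}  # task -> prerequisites

final_state, audit_trail = run_pipeline(TASKS, DEPENDENCIES)
```

In this example the report flags one missing value and one duplicated key, which is the kind of signal the audit step is meant to surface for critical ETL tasks.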
[0098] IX. Building Data Applications
[0099] The steps to build a data application are as follows:
[0100] 1) Use data lake metadata as the data source in iDS3.
[0101] 2) Use iDS3 in the data management platform to simplify the data analysis process and steps.
[0102] 3) Utilize the ACID transaction mechanism of the data lake to enable iDS3 to provide high-quality analytical data sources.
[0103] 4) Add a trusted data transaction mechanism to iDS3 and implement the iDS3 connector for each organization.
[0104] 5) All organizations exchange information in accordance with the methods specified in the Data Space Rules Manual.
[0105] 6) Provide users with industry-oriented, scenario-driven, on-demand, real-time data application services and interactive massive dataset analysis and display.
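The scenario-driven flow in these steps (look up metadata first, then link to the underlying source) can be sketched as follows. The catalog entries, the `mem://` scheme, and the loader registry are invented for illustration; a real deployment would resolve locators through the organization's iDS3 connector.

```python
from urllib.parse import urlparse

# Hypothetical metadata catalog: only IDs, scenarios, and locators, no data.
CATALOG = [
    {"uid": "steel-001", "scenario": "predictive-maintenance", "uri": "mem://vibration"},
    {"uid": "steel-002", "scenario": "energy-audit", "uri": "mem://power"},
]

# Stand-in for remote data sources owned by different organizations.
_MEMORY = {"vibration": [0.8, 1.1, 4.2], "power": [310, 295]}

# One loader per URI scheme; each receives the parsed locator.
LOADERS = {"mem": lambda parsed: _MEMORY[parsed.netloc]}


def serve(scenario):
    """Resolve a scenario to its data by way of metadata, on demand."""
    result = {}
    for entry in CATALOG:
        if entry["scenario"] != scenario:
            continue
        parsed = urlparse(entry["uri"])
        result[entry["uid"]] = LOADERS[parsed.scheme](parsed)
    return result


maintenance_data = serve("predictive-maintenance")
```

Data is fetched only when a scenario asks for it, which matches the on-demand character described in step 6).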
[0106] The data security and auditing in this application ensures data quality in iDS3 from three aspects: data writing, data consistency, and data source orchestration and scheduling, thus solving the shortcomings of data lakes as data management platforms in guaranteeing data quality.
[0107] The trusted data transaction mechanism and metadata enhancement management in this application can analyze and control the historical access and changes of the data lake, solving the problem that the lake warehouse integration as a data management platform lacks the ability to monitor the complete lifecycle of data.
[0108] Based on RDF data assimilation and the iLexiGraph built on industrial mechanisms, this application can populate data in a ready-to-use manner, support various types of queries, build indexes based on IDs, and quickly discover the relationships between different data sources. This addresses the shortcomings of traditional data spaces in three aspects: data modeling and querying, local storage and indexing, and data space discovery.
[0109] The industry data catalog, industry data identification and coding system, and iLexiGraph based on industrial mechanisms in this application enable source data applications that are industry-oriented, scenario-driven, on-demand, real-time, and readily usable. This solves the problem that native data weaving cannot achieve real-time, on-demand data fusion and integration due to the lack of an architecture that supports it.
[0110] The industrial data tensor DIKube described in this application provides on-demand data fusion and integration. This is because DIKube only stores metadata and its corresponding unique ID. The ID is encoded based on industrial mechanisms, with different applications using different encoding rules. DIKube's data storage method solves the problem that native data weaving cannot support a wide variety of data applications.
[0111] Figure 1 shows the industrial data management platform of the present invention. The platform has a five-layer, U-shaped architecture. From bottom to top, the layers are the data layer, the data ingestion & storage layer, the metadata management layer, the DIKube layer, and the data application layer. The data layer represents multi-source heterogeneous data sources distributed in different locations. The data ingestion & storage layer uses stream-batch processing and ETL technologies to ingest data from these sources and uses a data lake to write the ingested data into the data management platform. The metadata management layer achieves RDF-based data assimilation, constructs the iLexiGraph (a lexigraph built on industrial mechanisms using the industry data identification and coding system zyxID), creates indexes using zyxID, and stores metadata. The DIKube layer includes a DIKube management module and a data analysis module. The DIKube management module searches and queries metadata based on the indexes created in the metadata management layer, and can generate DIKube pages on different topics according to user needs, enabling the development of an industrial search engine as well as data exploration and analysis. The data analysis module receives business requirements from the data application layer and uses DIKube management to complete machine learning-based data analysis and artificial intelligence-based generative user application scenarios. The data application layer queries relevant metadata from DIKube based on application scenarios and user needs, and then links to the data sources required by different applications based on that metadata to complete tasks such as interactive massive data analysis and display, data application services, and trusted data transactions. The two sides of the U-shaped structure are data security & auditing and data governance. Data security & auditing ensures the quality of data within the platform. Data governance adopts a decentralized, distributed concept of data symbiosis and sharing, using technologies such as zyxID (which includes data lineage), industrial mechanisms, and the CGF (Content Generic Data Framework) to address three difficulties: effectively utilizing data inside enterprises, sharing data across the industry chain, and obtaining valuable open data on the internet.
[0112] Figure 2 shows the multi-functional data service appliance described herein, which has been used by numerous heavy industrial enterprises and their subsidiaries in China. The appliance can comprehensively integrate multi-dimensional data across heavy industrial enterprises, including enterprise, subsidiary, workshop, production line, and equipment data.
[0113] Figure 3 shows the implementation steps for generative application scenarios based on artificial intelligence. Based on these steps, the invention can automatically push high-quality data in real time to users with application needs, while meeting compliance requirements regarding data permissions and privacy.
[0114] In this document, the terms "upper," "lower," "front," "back," "left," "right," "top," "bottom," "inner," "outer," "vertical," and "horizontal," etc., indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are only used for the clarity of expressing the technical solution and for the convenience of description, and therefore should not be construed as limiting the present invention.
[0115] In this document, the terms “comprising,” “including,” or any other variations thereof are intended to cover non-exclusive inclusion, which includes not only the elements listed but also other elements not expressly listed.
[0116] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and descriptions in the specification are merely illustrative of the principles of the invention. Various changes and modifications can be made to the invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the present invention as claimed. The scope of protection of this invention is defined by the appended claims and their equivalents.
Claims
1. An artificial intelligence-based industry data management and sharing platform, characterized in that it includes the following steps:
1) Build a data management platform based on the data lake;
2) Based on the data management platform, create a decentralized, distributed, symbiotic, and shared industrial data space iDS3 based on a data weaving architecture;
3) Ingest data from the data source; import data from various databases, message queues, and file storage into the industrial data space iDS3 for unified data management;
4) Add a trusted data transaction mechanism to the industrial data space iDS3;
5) Enhanced metadata management; combining the capabilities of data lake and data weaving to achieve unified collection, modeling, storage, and sharing of metadata; the steps for metadata enhancement management are as follows:
5.1) Extract relevant instances, entities, and relationships from multi-source heterogeneous data based on semantics to achieve RDF-based data assimilation;
5.2) Construct an industry data catalog and an industry data coding system zyxID; based on industrial mechanisms, the data is summarized and mapped to nodes in the industrial lexigraph iLexiGraph, and the semantics of the data are enriched by the association between the lexigraph nodes;
5.3) Based on the complete metadata provided by Lakehouse, the metadata is aggregated, integrated, managed, and shared in iDS3;
5.4) Combining the table information in the data lake with the semantic relationships of data assets established by iDS3, the change trajectories of the data platform and external systems are correlated;
6) Metadata-based DIKube management; pre-generating metadata-based DIKube based on industry mechanisms;
7) Integrated data governance; iDS3 connects the data lake to the application layer, enabling the management of metadata and data;
8) Data security and auditing; through audit logs and data quality checks, monitor the data source and data quality of key ETL tasks in the data lake from three aspects: data writing, data consistency, and data source orchestration and scheduling;
9) Build data applications;
The data management platform has a five-layer architecture with a U-shaped feature, consisting of, from bottom to top, the data layer, the data ingestion and storage layer, the metadata management layer, the DIKube layer, and the data application layer;
The data layer represents multi-source heterogeneous data sources distributed in different locations;
The data ingestion and storage layer uses stream-batch processing technology and ETL technology to ingest data from multiple heterogeneous data sources, and uses a data lake to write the ingested data into the data management platform;
The metadata management layer functions to achieve RDF-based data assimilation, construct the lexigraph iLexiGraph based on industrial mechanisms using the industry data identification and coding system zyxID, and create indexes and store metadata using zyxID;
The DIKube layer includes a DIKube management module and a data analysis module; the DIKube management module enables metadata search and query based on the indexes created in the metadata management layer, and can generate DIKubes on different topics according to user needs, realizing the development of industrial search engines and data exploration and analysis; the data analysis module receives business requirements from the data application layer and uses DIKube management to complete machine learning-based data analysis and artificial intelligence-based generative user application scenarios;
The data application layer's function is to query the relevant metadata from DIKube based on the application scenario and user needs, and then link to the data sources required by different applications based on the metadata to complete interactive massive data analysis and display, data application services, and trusted data transaction tasks.
2. The artificial intelligence-based industry data management and sharing platform of claim 1, wherein the steps for building a data management platform based on a data lake are as follows: 1) Construct a distributed big data cluster environment; 2) Install the cluster visualization management plugin and environment deployment dependency plugins; 3) Deploy the storage data volume management component, the distributed object storage component MinIO, component services, the data query component Trino, and the BI tool Superset.
3. The artificial intelligence-based industry data management and sharing platform of claim 1, wherein the industrial data space iDS3 is used for ubiquitous storage of metadata, and it adopts the DIKube data storage method; each data tensor in DIKube stores only metadata and its corresponding unique ID.
4. The artificial intelligence-based industrial data management and sharing platform according to claim 1, characterized in that the trusted data transaction mechanism includes the following three types:
4.1) Establish a trustworthy mechanism for data transactions through unique IDs and blockchain technology;
4.2) Trace the operation history of data through audit records;
4.3) Backtrack to a specified historical version; based on fully recording data characteristics, generate an identifier (ID) from the extracted metadata and tags, and store it in the industrial data tensor iDIKube to establish a ubiquitous data governance mechanism.
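The three mechanisms in claim 4 can be sketched together as a versioned record with a tamper-evident audit trail. This is an assumption-laden illustration, not the platform's implementation: a simple SHA-256 hash chain stands in for blockchain technology, and the class `AuditedRecord` and its methods are hypothetical names.

```python
import hashlib
import json


class AuditedRecord:
    """Hypothetical record with a unique ID, version history, and hash-chained audit log."""

    def __init__(self, record_id):
        self.record_id = record_id
        self.versions = []  # metadata snapshots, oldest first
        self.audit = []     # audit entries, each linked to the previous by hash

    @staticmethod
    def _digest(body):
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

    def _append_audit(self, action, version):
        prev_hash = self.audit[-1]["hash"] if self.audit else "0" * 64
        body = {"record_id": self.record_id, "action": action,
                "version": version, "prev": prev_hash}
        self.audit.append({**body, "hash": self._digest(body)})

    def write(self, metadata):
        """Append a new version and record the operation (4.2)."""
        self.versions.append(dict(metadata))
        self._append_audit("write", len(self.versions) - 1)

    def rollback(self, version):
        """Backtrack to a historical version by re-appending it as the newest one (4.3)."""
        if not 0 <= version < len(self.versions):
            raise IndexError("unknown version")
        self.versions.append(dict(self.versions[version]))
        self._append_audit(f"rollback_to_{version}", len(self.versions) - 1)

    def current(self):
        return self.versions[-1]

    def verify(self):
        """Recompute the hash chain; any edited audit entry breaks it (4.1)."""
        prev = "0" * 64
        for entry in self.audit:
            body = {k: entry[k] for k in ("record_id", "action", "version", "prev")}
            if entry["prev"] != prev or entry["hash"] != self._digest(body):
                return False
            prev = entry["hash"]
        return True


rec = AuditedRecord("asset-001")
rec.write({"owner": "plant1", "schema": "v1"})
rec.write({"owner": "plant1", "schema": "v2"})
rec.rollback(0)
```

Rolling back by appending, rather than deleting later versions, keeps the full operation history available for tracing.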
5. The artificial intelligence-based industrial data management and sharing platform according to claim 1, characterized in that the steps for integrated data governance are as follows:
7.1) Ensure data quality of the data platform based on the data lake, and use data weaving to connect the data platform with external systems;
7.2) Relying on iDS3's data security and lifecycle management mechanism, manage the data access and read/write processes in the data lake;
7.3) Analyze and mine the historical access and change information of the data lake through audit logs and compliance management.
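Step 7.3 can be illustrated with a small aggregation over audit-log entries. The entry fields `table`, `op`, and `user`, and the function `summarize_audit_log`, are assumptions made for this sketch; the actual audit-log schema is not specified in the claims.

```python
from collections import Counter, defaultdict


def summarize_audit_log(entries):
    """Count operations per (table, op) and list which users changed each table."""
    op_counts = Counter()
    writers = defaultdict(set)
    for entry in entries:
        op_counts[(entry["table"], entry["op"])] += 1
        if entry["op"] == "write":
            writers[entry["table"]].add(entry["user"])
    return op_counts, {table: sorted(users) for table, users in writers.items()}


# Hypothetical audit-log entries for two data lake tables.
log_entries = [
    {"table": "sensor_readings", "op": "read", "user": "analyst_a"},
    {"table": "sensor_readings", "op": "write", "user": "etl_job"},
    {"table": "sensor_readings", "op": "read", "user": "analyst_b"},
    {"table": "orders", "op": "write", "user": "etl_job"},
    {"table": "orders", "op": "write", "user": "admin"},
]
op_counts, writers = summarize_audit_log(log_entries)
```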
6. The artificial intelligence-based industrial data management and sharing platform according to claim 1, characterized in that the implementation methods for data security and auditing are as follows:
8.1) Use iDS3 to extract data from the data source, and use the data lake to write the extracted data into the data management platform;
8.2) Based on the transactional and data quality mechanisms of the data lake, simplify maintaining iDS3 data consistency during ETL/ELT;
8.3) Implement the orchestration and scheduling of distributed ETL/ELT tasks based on iDS3, and load data from the source to the data lake.
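The data writing and data consistency checks of a key ETL task can be sketched as follows. This is a minimal example under stated assumptions: rows are plain dictionaries with string keys, the function names `checksum` and `check_etl_task` are hypothetical, and orchestration and scheduling (8.3) are outside the scope of the sketch.

```python
import hashlib
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_audit")


def checksum(rows):
    """Order-independent checksum over a list of row dictionaries."""
    digests = sorted(
        hashlib.sha256(repr(sorted(row.items())).encode("utf-8")).hexdigest()
        for row in rows
    )
    return hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()


def check_etl_task(task_name, source_rows, written_rows, required_fields):
    """Compare what was read with what was written and return a list of issues."""
    issues = []
    # Data writing: every extracted row should have been written.
    if len(source_rows) != len(written_rows):
        issues.append(
            f"row count mismatch: {len(source_rows)} read, {len(written_rows)} written"
        )
    # Data consistency: written content should match the source content.
    elif checksum(source_rows) != checksum(written_rows):
        issues.append("content checksum mismatch")
    # Data quality: required fields must be present and non-null.
    for i, row in enumerate(written_rows):
        missing = [f for f in required_fields if row.get(f) is None]
        if missing:
            issues.append(f"row {i} missing {missing}")
    # Audit log entry for the task.
    log.info("task=%s ok=%s issues=%s", task_name, not issues, issues)
    return issues


source = [{"id": 1, "temp": 20.5}, {"id": 2, "temp": 21.0}]
clean = check_etl_task("load_sensor", source, list(reversed(source)), ["id", "temp"])
dropped = check_etl_task("load_sensor", source, source[:1], ["id", "temp"])
```

In a real deployment the comparison would more likely run against table snapshots in the data lake than against in-memory lists; the lists here only keep the example self-contained.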
7. The artificial intelligence-based industrial data management and sharing platform according to claim 1, characterized in that the steps for building the data application are as follows:
9.1) Use data lake metadata as the data source in iDS3;
9.2) Use iDS3 in the constructed data management platform to simplify data analysis processes and steps;
9.3) Utilize the ACID transaction mechanism of the data lake to enable iDS3 to provide high-quality analytical data sources;
9.4) Add a trusted data transaction mechanism to iDS3 and implement the iDS3 connector for each organization;
9.5) All organizations exchange information in accordance with the methods specified in the Data Space Rules Manual;
9.6) Provide users with industry-oriented, scenario-driven, on-demand, real-time data application services and interactive massive dataset analysis and display.