Data processing method and device, server and medium

A data processing technology, applied in the fields of data governance and artificial intelligence. It addresses two problems: mapping strategies cannot be reused across projects, and existing tools cannot generate mapping strategies automatically. Its effects are a lower manual data-analysis workload and a lower probability of human error.

Pending Publication Date: 2021-01-29
RUN TECH CO LTD BEIJING
Cites: 4; Cited by: 0

AI Technical Summary

Problems solved by technology

[0003] In the field of big data governance, there can be thousands or even tens of thousands of source datasets. Across different project implementations, data governance tools can be reused, but mapping strategies cannot. Fully manual analysis is time-consuming, laborious, and prone to human error.
Although many data governance products provide visualization tools that make manual formulation of mapping strategies more efficient, none of them can generate or recommend mapping strategies automatically to reduce the workload of manual data analysis.

Examples

Embodiment 1

[0027] Figure 1 is a flow chart of the data processing method provided by Embodiment 1 of the present invention. This embodiment applies to generating a mapping strategy between a source dataset schema and a target dataset schema. The method can be executed by a data processing device and specifically includes the following steps:

[0028] S110. Acquire the field names of the source dataset schema and the field names of the target dataset schema.

[0029] Generally, a dataset schema is a relational database schema, a non-relational database schema, or a knowledge-graph-based graph database schema. In each case, the schema consists of field names. The field names of the source dataset schema and those of the target dataset schema are obtained separately; they are the input for deriving the mapping strategy that converts the source dataset schema into the target dataset schema.
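For the relational case, step S110 can be illustrated with a minimal sketch. It assumes both schemas live in a SQLite database; the table names `src_customer` and `dst_customer`, their columns, and the helper `get_field_names` are hypothetical and do not come from the patent text.

```python
import sqlite3


def get_field_names(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return the column names of `table` in declaration order."""
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [row[1] for row in rows]


# Hypothetical source and target tables, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_customer (cust_id INTEGER, cust_nm TEXT, tel_no TEXT)")
conn.execute(
    "CREATE TABLE dst_customer (customer_id INTEGER, customer_name TEXT, phone TEXT)"
)

source_fields = get_field_names(conn, "src_customer")
target_fields = get_field_names(conn, "dst_customer")
print(source_fields)
print(target_fields)
```

A non-relational or graph schema would need a different reader, but the result in each case would be the same kind of object: a list of field names per schema.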

[0030] S120. Combine each field name of the source dataset schema with all f...

Embodiment 2

[0044] Figure 4 is a flow chart of the data processing method provided by Embodiment 2 of the present invention. This embodiment further optimizes the previous embodiment. The data processing method further includes: displaying the mapping strategy and accepting a correctness judgment on it; the mapping strategies judged correct are then used to continue training the mapping strategy generation model, so that the model is continuously optimized and the mapping strategies it produces become more accurate.
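The feedback loop described in this paragraph can be sketched as follows. The representation is an assumption: a mapping strategy is modeled here as a dict from source field name to target field name, and reviewer judgments as a dict from source field name to a boolean. The function `collect_confirmed` and the `training_pairs` pool are hypothetical names used only for illustration.

```python
def collect_confirmed(
    proposed: dict[str, str], judgments: dict[str, bool]
) -> list[tuple[str, str]]:
    """Keep only the proposed (source, target) pairs a reviewer marked correct."""
    # Pairs without a judgment are treated as unconfirmed and skipped.
    return [(src, dst) for src, dst in proposed.items() if judgments.get(src, False)]


# Mapping strategy as displayed to the reviewer (hypothetical model output).
proposed = {
    "cust_id": "customer_id",
    "cust_nm": "customer_name",
    "tel_no": "customer_name",  # a wrong suggestion the reviewer will reject
}
# Correctness judgments accepted from the reviewer.
judgments = {"cust_id": True, "cust_nm": True, "tel_no": False}

# Hypothetical pool of examples used to continue training the model.
training_pairs: list[tuple[str, str]] = []
training_pairs.extend(collect_confirmed(proposed, judgments))
print(training_pairs)
```

Only confirmed pairs enter the pool, which matches the text's statement that the mappings determined to be correct are the ones used for further training.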

[0045] As shown in Figure 4, the method specifically includes the following steps:

[0046] S210. Obtain each field name of the source dataset schema and each field name of the target dataset schema.

[0047] S220. Combine each field name of the source dataset schema with all field names of the target dataset schema to obtain a field name combination.

[0048] S230. Perform vectorization processing on all the field name...
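One possible reading of steps S220 and S230 is sketched below. Two assumptions are made because the visible text does not settle them: the "field name combination" is taken to be every (source, target) pair, and the vectorization is a simple hashed character-bigram count. The patent does not state its actual vectorization method, and the functions `combine` and `vectorize` are hypothetical.

```python
import zlib
from itertools import product


def combine(source_fields: list[str], target_fields: list[str]) -> list[tuple[str, str]]:
    """Pair every source field name with every target field name."""
    return list(product(source_fields, target_fields))


def vectorize(name: str, dim: int = 16) -> list[float]:
    """Assumed vectorization: count character bigrams, hashed into `dim` buckets."""
    vec = [0.0] * dim
    text = name.lower()
    for i in range(len(text) - 1):
        bigram = text[i : i + 2]
        # crc32 is stable across runs, unlike Python's built-in str hash.
        vec[zlib.crc32(bigram.encode("utf-8")) % dim] += 1.0
    return vec


pairs = combine(["cust_id", "cust_nm"], ["customer_id", "customer_name", "phone"])
# One input vector per pair: source vector concatenated with target vector.
pair_vectors = [vectorize(src) + vectorize(dst) for src, dst in pairs]
print(len(pairs), len(pair_vectors[0]))
```

Pairing every source field with every target field grows as the product of the two schema sizes, so a real system with thousands of fields might need to batch or prune the pairs.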

Embodiment 3

[0053] Figure 5 is a structure diagram of a data processing device provided by Embodiment 3 of the present invention. The data processing device includes a field name acquisition module 310, a field name combination module 320, and a mapping strategy acquisition module 330.

[0054] The field name acquisition module 310 obtains each field name of the source dataset schema and each field name of the target dataset schema. The field name combination module 320 combines each field name of the source dataset schema with all field names of the target dataset schema to obtain a field name combination. The mapping strategy acquisition module 330 vectorizes all field names in the field name combination and inputs the resulting vector combination into the trained mapping strategy generation model to obtain the mapping strategy for mapping the source dataset schema to the target dataset schema.
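The three-module structure can be sketched as three small classes. This is an illustration under stated assumptions, not the patented implementation: a schema is modeled as a `{field_name: type}` dict, and the trained mapping strategy generation model is replaced by a stand-in (letter-count vectors scored by cosine similarity), because the text does not describe the model itself. All class and function names are hypothetical.

```python
import math
from itertools import product
from typing import Callable

Vector = list[float]


class FieldNameAcquisitionModule:  # corresponds to module 310
    def acquire(self, schema: dict[str, str]) -> list[str]:
        # Assumption: a schema is a {field_name: type} dict.
        return list(schema)


class FieldNameCombinationModule:  # corresponds to module 320
    def combine(self, source: list[str], target: list[str]) -> list[tuple[str, str]]:
        return list(product(source, target))


class MappingStrategyAcquisitionModule:  # corresponds to module 330
    def __init__(
        self,
        vectorize: Callable[[str], Vector],
        model: Callable[[Vector, Vector], float],
    ) -> None:
        self.vectorize = vectorize
        self.model = model  # stand-in for the trained model

    def acquire(self, pairs: list[tuple[str, str]]) -> dict[str, str]:
        """For each source field, keep the target field with the highest score."""
        best: dict[str, tuple[str, float]] = {}
        for src, dst in pairs:
            score = self.model(self.vectorize(src), self.vectorize(dst))
            if src not in best or score > best[src][1]:
                best[src] = (dst, score)
        return {src: dst for src, (dst, _) in best.items()}


def letter_counts(name: str) -> Vector:
    """Stand-in vectorizer: counts of the letters a-z, ignoring other characters."""
    vec = [0.0] * 26
    for ch in name.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(u: Vector, v: Vector) -> float:
    """Stand-in scorer: cosine similarity, 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


source_schema = {"name": "TEXT", "phone_no": "TEXT"}
target_schema = {"full_name": "TEXT", "phone": "TEXT"}

acquirer = FieldNameAcquisitionModule()
pairs = FieldNameCombinationModule().combine(
    acquirer.acquire(source_schema), acquirer.acquire(target_schema)
)
mapping = MappingStrategyAcquisitionModule(letter_counts, cosine).acquire(pairs)
print(mapping)
```

Passing the vectorizer and the scorer in as callables keeps module 330 independent of any particular model, so a trained model could replace the stand-in without changing the other two modules.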

[0055] In the te...

Abstract

The embodiment of the invention discloses a data processing method and device, a server, and a medium. The method comprises: obtaining each field name of a source dataset schema and each field name of a target dataset schema; combining each field name of the source dataset schema with all field names of the target dataset schema to obtain a field name combination; and vectorizing all field names in the field name combination and inputting the resulting vector combination into a trained mapping strategy generation model to obtain a mapping strategy for mapping the source dataset schema to the target dataset schema. The technical scheme solves the problem that establishing a mapping strategy between the source dataset schema and the target dataset schema consumes a large amount of labor: the mapping strategy is provided automatically, which effectively reduces both the manual workload and the probability of human error.

Description

Technical field

[0001] Embodiments of the present invention relate to the technical fields of data governance and artificial intelligence, and in particular to a data processing method, device, server, and medium.

Background

[0002] In a data governance platform, the raw data extracted from various business systems have different data schemas. When building data warehouses, knowledge graphs, and other data applications, these datasets from different sources must be merged, split, extracted, or converted so that the source data is mapped to the designed target data schema. However, data from multiple source systems that is mapped to the same target data schema may have different data schemas and therefore different mapping strategies. Each source dataset requires manual analysis by data engineers, based on the business meaning and field semantics of the data schema, in order to determine the correct mapping...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/25
CPC: G06F16/254, G06F16/258, Y02D10/00
Inventor: 由磊, 张俊杰, 李新鹏, 李贺, 毛勇岗
Owner: RUN TECH CO LTD BEIJING