Data extraction method, device, server and storage medium for distributed system

A data extraction technology for distributed systems, applied in the field of distributed-system data extraction. It addresses the problems of long processing time and the inability to complete a large amount of incremental data extraction, or a full data extraction, within a timing cycle, and achieves the effect of reducing time consumption.

Active Publication Date: 2022-02-18
RUN TECH CO LTD BEIJING
Cites: 10; Cited by: 0

AI Technical Summary

Problems solved by technology

[0003] The existing technology uses full data extraction. When the amount of data is large, a large amount of incremental data cannot be extracted within one timing cycle. Storing all the data first and then extracting the association relationships in full by scheduled batch processing takes a long time, and the full extraction may not finish within one timing cycle. An incremental processing method is therefore required to solve the problem of extracting large amounts of data.


Examples

Embodiment 1

[0031] Figure 1 is a flow chart of a data extraction method for a distributed system provided by Embodiment 1 of the present invention. This embodiment is applicable to the case of merging newly added data with historical data.

[0032] The method specifically includes the following steps:

[0033] S101. Extract first data and a first data relationship from newly added data, where the first data relationship is a first association relationship between the first data.

[0034] The association relationship in this embodiment is the interdependence and mutual influence between data. For example, suppose a school has three data tables: student (student number, name), course (course name, course number), and course selection (student number, course number, grade). The "student number" and "course number" in the course selection table must correspond to the student number and name in the student table and to the course name and course number in the course table. When the student's name is deleted or the cou...
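
The dependency described in the school example can be illustrated with a minimal Python sketch. Only the column layout comes from the example; the sample rows and the names `students`, `courses`, `selections`, and `dangling_selections` are hypothetical. The sketch only detects course-selection rows whose association is broken; how the method itself reacts to a deletion is elided in the text above and is not modelled here.

```python
# Hypothetical sample rows; only the column layout comes from the example.
students = {"S001": "Alice", "S002": "Bob"}           # student number -> name
courses = {"C101": "Databases", "C102": "Networks"}   # course number -> course name
selections = [                                        # (student number, course number, grade)
    ("S001", "C101", 90),
    ("S002", "C102", 85),
    ("S001", "C102", 78),
]


def dangling_selections(students, courses, selections):
    """Return course-selection rows whose student or course number has no match."""
    return [
        row for row in selections
        if row[0] not in students or row[1] not in courses
    ]


# Every row references an existing student and an existing course.
print(dangling_selections(students, courses, selections))  # []

# Deleting a student breaks the association for the rows that depended on it.
del students["S002"]
print(dangling_selections(students, courses, selections))  # [('S002', 'C102', 85)]
```

The point of the sketch is that the three tables cannot be extracted independently: a change in one table affects which rows of another remain valid.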

Embodiment 2

[0045] As shown in Figure 2, this embodiment provides a data extraction method for a distributed system. On the basis of the above embodiment, specific steps for matching newly added data with historical data are added, as follows:

[0046] S201. Extract first data and a first data relationship from newly added data, where the first data relationship is a first association relationship between the first data;

[0047] S202. Acquire historical data, where the historical data includes second data and a second data relationship, where the second data relationship is a second association relationship between the second data;

[0048] S2031. Compare the first data with the second data in sequence, and determine whether each of the first data and the second data is repeated;

[0049] S2032. If repeated, delete the first data, and save the second data as the third data;

[0050] S2033. If not repeated, merge the first data and the second data into the third data;

[0051] The third d...
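
Steps S2031 to S2033 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes that records are hashable values and that "repeated" means plain equality, and the function name `merge_incremental` and the sample records are hypothetical.

```python
def merge_incremental(first_data, second_data):
    """Merge newly added records (first data) into historical records (second data)."""
    third_data = list(second_data)      # historical records are always kept
    seen = set(second_data)
    for record in first_data:           # S2031: compare each new record in sequence
        if record in seen:              # S2032: repeated, so the new copy is dropped
            continue
        third_data.append(record)       # S2033: not repeated, so it is merged in
        seen.add(record)                # assumption: also drop repeats inside the new batch
    return third_data


history = ["alice", "bob"]
new = ["bob", "carol"]
print(merge_incremental(new, history))  # ['alice', 'bob', 'carol']
```

Keeping a set of already seen records makes each comparison a constant-time lookup, so the cost of a merge grows with the size of the new batch rather than with a full rescan of the historical data for every record.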

Embodiment 3

[0061] As shown in Figure 3, this embodiment provides a data extraction method for a distributed system. It refines the step in the above embodiment of matching the first data with the second data, and the first data relationship with the second data relationship, to generate a matching result; the matching is implemented by drawing relationship diagrams. The specific steps are as follows:

[0062] S301. Extract first data and a first data relationship from newly added data, where the first data relationship is a first association relationship between the first data.

[0063] S302. Acquire historical data, where the historical data includes second data and a second data relationship, where the second data relationship is a second association relationship between the second data.

[0064] S3031. Use the first data as a first node, and use the first data relationship as a first connecting line.

[0065] S3032. Compose the first node and the first connecting line into a first relati...
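
Steps S3031 and S3032 can be pictured with a small Python sketch that turns data into nodes and relationships into connecting lines. The class `RelationGraph`, the function `build_graph`, the sample identifiers, and the choice of directed (source, target) pairs are all assumptions made for illustration; the remaining steps of this embodiment are elided above and are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class RelationGraph:
    """A relationship diagram: data as nodes, relationships as connecting lines."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)   # each edge is a (source, target) pair

    def add_relation(self, source, target):
        # A connecting line implies that both of its endpoints are nodes.
        self.nodes.update((source, target))
        self.edges.add((source, target))


def build_graph(data, relations):
    graph = RelationGraph()
    graph.nodes.update(data)                  # S3031: each datum becomes a node
    for source, target in relations:          # S3031: each relationship becomes a connecting line
        graph.add_relation(source, target)
    return graph                              # S3032: nodes and lines together form the diagram


first_graph = build_graph(
    data=["S001", "C101", "C102"],
    relations=[("S001", "C101"), ("S001", "C102")],
)
print(sorted(first_graph.nodes))  # ['C101', 'C102', 'S001']
print(sorted(first_graph.edges))  # [('S001', 'C101'), ('S001', 'C102')]
```

Storing nodes and connecting lines as sets means a diagram built from historical data in the same way could be compared with this one using ordinary set operations.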

Abstract

The invention discloses a data extraction method for a distributed system, comprising: extracting first data and a first data relationship from newly added data, the first data relationship being a first association relationship between the first data; acquiring historical data, the historical data including second data and a second data relationship, the second data relationship being a second association relationship between the second data; matching the first data with the second data, and matching the first data relationship with the second data relationship, to generate a matching result; and merging and storing the newly added data and the historical data according to the matching result. A data extraction device, server and storage medium for the distributed system are also disclosed. By merging newly added data with historical data, the invention can extract the data relationships of incremental data more quickly.

Description

Technical field

[0001] Embodiments of the present invention relate to data extraction technology, and in particular to a data extraction method, device, server, and storage medium for a distributed system.

Background

[0002] With the development of computer technology, the amount of data that systems need to process keeps increasing, and so do the associations between the data that need to be extracted.

[0003] The existing technology uses full data extraction. When the amount of data is large, a large amount of incremental data cannot be extracted within one timing cycle. Storing all the data first and then extracting the association relationships in full by scheduled batch processing takes a long time, and the full extraction may not finish within one timing cycle. An incremental processing method is therefore required to solve the problem of large amount of data extract...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/182, G06F16/172, G06F16/16
CPC: G06F16/182, G06F16/172, G06F16/16
Inventors: 张超, 刘涛, 张志远, 万月亮
Owner: RUN TECH CO LTD BEIJING