Task breakpoint resuming method applied to data cleaning tool

A data cleaning and breakpoint resuming technology, applied in the field of data processing, that addresses slow business data cleaning and an unstable cleaning process. It avoids repeated execution of work already completed, improves cleaning speed, and makes operation more user-friendly.

Pending Publication Date: 2020-03-27
WUXI SHILING TECH

AI Technical Summary

Problems solved by technology

[0003] Aiming at the shortcomings of slow business data cleaning speed and unstable operation process in the p



Examples


Example Embodiment

[0028] Example 1

[0029] Step 1: Extract the third-party business source data, split it at breakpoints into groups, and mark each group;

[0030] Step 2: Query the "source data breakpoint group mark storage table". If it holds no data for the task, insert the breakpoint marks; if it does, find the group numbers of the breakpoint marks that are still unprocessed;

[0031] Step 3: Obtain the data source chunks that correspond to the unprocessed group numbers and process them in sequence: clean, convert, and load into the target library table;

[0032] Step 4: After each group of source data chunks has been cleaned, change its breakpoint mark status in the mark storage table to processed. Once all tasks have been executed, delete the task's breakpoint mark information from the "source data breakpoint group mark storage table".
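
The four steps can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. The table name `breakpoint_marks`, its columns, the `task_id` key, the group size of 1,000, and the `clean_and_load` callback are all assumptions introduced here; the source does not specify them. SQLite stands in for the mark storage table, and the sketch assumes the source data is extracted in the same order on every run so that group numbers stay stable.

```python
import sqlite3

CHUNK_SIZE = 1000  # assumed group size; the source does not give one


def split_into_groups(rows, size=CHUNK_SIZE):
    """Step 1: split source rows into numbered groups (breakpoint chunks)."""
    starts = range(0, len(rows), size)
    return {n: rows[i:i + size] for n, i in enumerate(starts, start=1)}


def pending_groups(conn, task_id, group_numbers):
    """Step 2: insert marks if the task has none, then list unprocessed groups."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS breakpoint_marks ("
        "task_id TEXT, group_no INTEGER, status TEXT, "
        "PRIMARY KEY (task_id, group_no))"
    )
    count = conn.execute(
        "SELECT COUNT(*) FROM breakpoint_marks WHERE task_id = ?", (task_id,)
    ).fetchone()[0]
    if count == 0:
        conn.executemany(
            "INSERT INTO breakpoint_marks VALUES (?, ?, 'unprocessed')",
            [(task_id, g) for g in group_numbers],
        )
        conn.commit()
    rows = conn.execute(
        "SELECT group_no FROM breakpoint_marks "
        "WHERE task_id = ? AND status = 'unprocessed' ORDER BY group_no",
        (task_id,),
    ).fetchall()
    return [r[0] for r in rows]


def run_task(conn, task_id, rows, clean_and_load):
    """Run or resume a cleaning task; groups already processed are skipped."""
    groups = split_into_groups(rows)
    for group_no in pending_groups(conn, task_id, list(groups)):
        # Step 3: clean, convert, and load one chunk (hypothetical callback).
        clean_and_load(groups[group_no])
        # Step 4: record the chunk as processed before moving on.
        conn.execute(
            "UPDATE breakpoint_marks SET status = 'processed' "
            "WHERE task_id = ? AND group_no = ?",
            (task_id, group_no),
        )
        conn.commit()
    # All groups are done: remove this task's breakpoint marks.
    conn.execute("DELETE FROM breakpoint_marks WHERE task_id = ?", (task_id,))
    conn.commit()
```

If `clean_and_load` raises partway through, calling `run_task` again with the same `task_id` picks up at the first group still marked unprocessed. Note that in this sketch the load and the mark update are separate operations, so a crash between them would cause that one chunk to be reprocessed on restart.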

[0033] Take the cleaning of 10,000 records of patient admission registration information...



Abstract

The invention relates to the technical field of data processing and discloses a task breakpoint resuming method applied to a data cleaning tool. The method comprises the following steps: (1) extracting the target source data and splitting it at breakpoints into data source chunks; (2) when a processing task fails abnormally and needs to be restarted, querying the source data breakpoint grouping mark table and locating the nearest marked breakpoint in an unprocessed state; (3) acquiring the corresponding data source chunks according to the group numbers of the unprocessed breakpoint marks, executing the unprocessed data source chunks in sequence, and continuing the cleaning task to completion; and (4) completing task execution when all groups in the source data breakpoint grouping mark table are in a processed state. The method segments the data to be cleaned by splitting the source business data at breakpoints, grouping it, and marking the groups, so that after a cleaning task is abnormally interrupted, the remaining work can be resumed from the point of interruption once the breakpoints have been compared.
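
As a small illustration of steps (2) to (4), the restart logic reduces to "find the lowest-numbered unprocessed group, process it, mark it, repeat". The sketch below keeps the mark table as an in-memory dictionary purely for clarity. The names `nearest_unprocessed`, `task_complete`, `resume`, and the status strings are hypothetical and do not come from the source.

```python
from typing import Callable, Dict, List, Optional

UNPROCESSED = "unprocessed"
PROCESSED = "processed"


def nearest_unprocessed(marks: Dict[int, str]) -> Optional[int]:
    """Step (2): locate the lowest group number still marked unprocessed."""
    pending = [group for group, status in marks.items() if status == UNPROCESSED]
    return min(pending) if pending else None


def task_complete(marks: Dict[int, str]) -> bool:
    """Step (4): the task is done when every group is marked processed."""
    return all(status == PROCESSED for status in marks.values())


def resume(
    marks: Dict[int, str],
    chunks: Dict[int, List[str]],
    clean: Callable[[List[str]], None],
) -> bool:
    """Step (3): process the remaining chunks in group order, then report completion."""
    while True:
        group_no = nearest_unprocessed(marks)
        if group_no is None:
            break
        clean(chunks[group_no])
        marks[group_no] = PROCESSED
    return task_complete(marks)
```

For example, with marks `{1: "processed", 2: "unprocessed", 3: "unprocessed"}`, a restart begins at group 2 and group 1 is never cleaned a second time.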

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a task breakpoint resuming method applied to data cleaning tools.

Background technique

[0002] At present, with the development of medical informatization, hospital information integration platforms are being built widely. The ETL data cleaning tools they include are mainly used to build a hospital-wide data center and realize an independent data warehouse. ETL is the process of data extraction (Extract), transformation (Transform), and loading (Load). It is the core of BI / DW (Business Intelligence / Data Warehouse) and an important part of building a data center. Users extract the required data from the data source, convert it, and finally load it into the data center according to the predefined data center model. The business data of the medical business system HIS, LIS, PACS, EMR and other syste...


Application Information

IPC(8): G06F16/215, G16H40/00
CPC: G06F16/215, G16H40/00
Inventor 纪峥嵘, 刘军, 叶庆楚, 陈博文, 吴永佳
Owner WUXI SHILING TECH