A data cleaning method and a device for data cleaning

The data cleaning technology, applied in the field of data processing, addresses the large amount of transmitted data, high data processing pressure, and low execution efficiency of server-side cleaning. It improves cleaning efficiency, saves cleaning time, and reduces the data cleaning pressure on the server.

Active Publication Date: 2019-01-04
WUHAN DAMENG DATABASE

AI Technical Summary

Problems solved by technology

[0005] In view of the above defects or improvement needs of the prior art, the present invention provides a data cleaning method and a device for data cleaning. The client performs the data cleaning and sends the cleaned data to the server, which alleviates the data processing pressure on the server. At the same time, the client can determine a compensation cleaning strategy according to the correlation between the cleaning strategies sent by the server, and clean the already cleaned data again according to that compensation cleaning strategy, which saves data cleaning time and improves the efficiency of data cleaning. This solves the current technical problem that the server cleans and integrates the source data only after obtaining it from the client, which results in a large amount of transmitted data, high data processing pressure, and low execution efficiency.

Examples


Embodiment 1

[0053] At present, after the data center server obtains the source data from the client, it cleans and integrates the source data to obtain the target data. Since the source data may contain a large amount of invalid data, or data that needs to be cleaned, this results in a large amount of transmitted data. Especially when the data center server is connected to multiple clients, the data center server suffers from high execution pressure and low execution efficiency. In order to solve this problem, the present invention provides a data cleaning method that moves data cleaning and integration from the server to the client. This fully utilizes the computing resources of the client, reduces the network transmission volume, reduces the processing pressure on the server, expands the overall throughput of the data cleaning and integration system, and improves the overall operating efficiency of the system. On the other hand, the data cl...
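The idea of cleaning on the client before transmission can be illustrated with a minimal Python sketch. It is not the patented implementation: the rule functions, the `clean_on_client` helper, and the convention that a rule returns `None` to drop a record are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]
# Hypothetical convention: a rule returns the cleaned record, or None to drop it.
Rule = Callable[[Record], Optional[Record]]


def drop_missing_id(record: Record) -> Optional[Record]:
    """Discard records that have no usable 'id' field (illustrative rule)."""
    return record if record.get("id") else None


def strip_whitespace(record: Record) -> Optional[Record]:
    """Trim surrounding whitespace from every field (illustrative rule)."""
    return {key: value.strip() for key, value in record.items()}


def clean_on_client(source: List[Record], strategy: List[Rule]) -> List[Record]:
    """Apply each rule of the strategy in order and keep the surviving records."""
    cleaned: List[Record] = []
    for record in source:
        current: Optional[Record] = record
        for rule in strategy:
            current = rule(current)
            if current is None:  # record was judged invalid, stop processing it
                break
        if current is not None:
            cleaned.append(current)
    return cleaned


source_data = [
    {"id": "1", "name": "  Alice "},
    {"id": "", "name": "orphan"},
    {"id": "2", "name": "Bob"},
]
first_cleaning_data = clean_on_client(source_data, [drop_missing_id, strip_whitespace])
```

In this sketch only two of the three source records would be sent to the server, which is the reduction in transmitted data that the paragraph describes.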

Embodiment 2

[0096] Figure 1 explains the data cleaning method from the client side. Referring to Figure 2, this embodiment explains the data cleaning method of the present invention from the server side. The method specifically includes the following steps:

[0097] Step 20: The server sends the first cleaning strategy to the client, and receives the first cleaning data and/or the summary of the first cleaning data after the client cleans the source data according to the first cleaning strategy.

[0098] In this embodiment, the server sends the first cleaning strategy to the client. The first cleaning strategy includes a preset cleaning rate threshold to control the data cleaning time. The preset cleaning rate threshold is set by the server according to actual conditions and is not specifically limited here.

[0099] The client cleans the source data according to the first cleaning strategy to obtain the first cleaning data and/or the summary of the...
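The exchange in Step 20 can be sketched as two small messages: a strategy sent by the server and a summary returned by the client. The source does not specify either format, so everything below is an assumption: the `CleaningStrategy` fields, the JSON encoding, and a summary made of a record count plus a SHA-256 hash. The threshold is only carried as a field, since its exact semantics are not given here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class CleaningStrategy:
    """Hypothetical strategy message sent from the server to the client."""
    rule_names: List[str]
    # Set by the server; carried here without interpreting its meaning.
    cleaning_rate_threshold: float


def encode_strategy(strategy: CleaningStrategy) -> str:
    """Serialize the strategy for transmission (JSON is an assumption)."""
    return json.dumps(asdict(strategy), sort_keys=True)


def summarize(records: List[Dict[str, str]]) -> Dict[str, object]:
    """Build an assumed summary: record count plus a hash of the canonical JSON."""
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return {
        "record_count": len(records),
        "sha256": hashlib.sha256(canonical).hexdigest(),
    }


def summary_matches(summary: Dict[str, object], records: List[Dict[str, str]]) -> bool:
    """Server-side check that received data is consistent with its summary."""
    return summary == summarize(records)


first_strategy = CleaningStrategy(rule_names=["strip"], cleaning_rate_threshold=0.8)
wire_message = encode_strategy(first_strategy)

first_cleaning_data = [{"id": "1", "name": "Alice"}, {"id": "2", "name": "Bob"}]
first_summary = summarize(first_cleaning_data)
```

A summary of this kind is much smaller than the cleaned data itself, so a server could receive it alone, or alongside the data to verify what arrived.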

Embodiment 3

[0112] Referring to Figure 3, Figure 3 is a schematic structural diagram of a device for data cleaning provided by an embodiment of the present invention. The device for data cleaning in this embodiment includes one or more processors 31 and a memory 32. Figure 3 takes one processor 31 as an example.

[0113] The processor 31 and the memory 32 can be connected by a bus or by other means; Figure 3 takes connection via a bus as an example.

[0114] The memory 32, as a non-volatile computer-readable storage medium for data cleaning, can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the data cleaning method of Embodiment 1 and/or Embodiment 2 and the corresponding program instructions. The processor 31 executes various functional applications and data processing of the data cleaning method by running the non-volatile software programs, instructions and modules stored in the memory 32, so as to realize the data cleaning ...


Abstract

The invention discloses a data cleaning method and a device for data cleaning. The data cleaning method comprises: cleaning source data according to a first cleaning strategy to obtain first cleaning data and/or a first cleaning data digest, and sending the first cleaning data and/or the first cleaning data digest to a server; receiving a second cleaning strategy that the server has adjusted and updated according to the first cleaning data and/or the first cleaning data digest; processing the first cleaning data and/or the source data corresponding to the first cleaning data according to a compensation cleaning strategy to obtain first processing data; and cleaning the uncleaned data according to the second cleaning strategy to obtain second cleaning data and/or a second cleaning data digest. The data cleaning is carried out by the client, and the cleaned data is sent to the server, which reduces the data processing pressure on the server and improves the efficiency of data cleaning.
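The role of the compensation cleaning strategy can be shown with a small Python sketch. How the compensation strategy is actually derived is not given in this excerpt; the sketch assumes, purely for illustration, that a strategy is an ordered list of rule names and that the compensation strategy consists of the rules present in the second strategy but absent from the first. The rule names and helpers are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], Record]

# Hypothetical rule registry shared by client and server.
RULES: Dict[str, Rule] = {
    "strip": lambda r: {k: v.strip() for k, v in r.items()},
    "lower_email": lambda r: {**r, "email": r["email"].lower()},
}


def apply_rules(records: List[Record], rule_names: List[str]) -> List[Record]:
    """Run the named rules, in order, over every record."""
    result = []
    for record in records:
        for name in rule_names:
            record = RULES[name](record)
        result.append(record)
    return result


def compensation_strategy(first: List[str], second: List[str]) -> List[str]:
    """Assumed derivation: rules the second strategy adds on top of the first."""
    return [name for name in second if name not in first]


first_strategy = ["strip"]
second_strategy = ["strip", "lower_email"]  # updated strategy from the server

already_cleaned = apply_rules([{"email": " A@X.COM "}], first_strategy)
uncleaned = [{"email": " B@Y.COM "}]

compensation = compensation_strategy(first_strategy, second_strategy)
# Data cleaned earlier only needs the missing rules, not a full second pass.
first_processing_data = apply_rules(already_cleaned, compensation)
# Data not yet cleaned goes through the full second strategy.
second_cleaning_data = apply_rules(uncleaned, second_strategy)
```

Under this assumption the already cleaned records skip the rules they have passed once, which is one way the time saving described in the summary could arise.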

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and more specifically relates to a data cleaning method and a data cleaning device.

Background

[0002] With the wide application of enterprise information systems, information systems have become the key to maintaining business operations. The diversified business types of enterprises have led to increasingly complex data access requirements. At the same time, the sharp increase in data volume has overwhelmed database servers.

[0003] Therefore, it is necessary to establish a data center to improve the availability of information systems and the efficiency of access queries. However, due to the differences in the construction of information systems, it is often necessary to clean and integrate the source data from various information systems in the process of establishing a data center. At present, after the data center server obtains the source data from...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/215
Inventor: 张勇高东升付铨梅纲
Owner: WUHAN DAMENG DATABASE