Data processing method

A data cleaning method in the field of data processing that improves operating efficiency and detection accuracy, reduces the number of field-matching operations, and addresses the detection of similar duplicate records.

Status: Inactive. Publication date: 2017-10-27
福建中经汇通有限责任公司
Cites: 3. Cited by: 12.

AI Technical Summary

Problems solved by technology

Effectively detects similar duplicate records in large data sets.

Embodiment Construction

[0037] In order to understand the above-mentioned purpose, features and advantages of the present invention more clearly, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.

[0038] In the following description, many specific details are set forth to provide a full understanding of the present invention. However, the present invention can also be implemented in ways other than those described here. Therefore, the protection scope of the present invention is not limited by the specific embodiments disclosed below.

[0039] The cleaning system designed by the present invention has an algorithm library and a rule library. These algorithm libraries and rule libraries are open, and contain a large number of rich cleaning alg...

Abstract

The invention discloses a data processing method. The method sorts and groups data sets using the zone bit codes of the characters in a key field, which improves both operating efficiency and detection precision. By selecting representative characters within each group and deleting unrelated characters, it reduces the number of character comparisons needed during similar-duplicate-record detection and so shortens record-matching time. The method can therefore handle similar-duplicate-record detection on large data volumes and cleans useless data out of the data sources.
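The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: it assumes that "zone bit code" refers to the GB2312 zone and position pair of a character, it groups records by the zone of the first key-field character only, and it uses `difflib.SequenceMatcher` with an arbitrary 0.8 threshold as a stand-in similarity measure. The function names (`zone_code`, `group_by_first_zone`, `find_similar`) are hypothetical, and the step that selects representative characters and deletes unrelated ones is omitted because the text here does not specify it.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def zone_code(ch):
    """Return the GB2312 (zone, position) pair for one character.

    Characters outside GB2312 fall back to (0, code point) so they still sort.
    """
    try:
        raw = ch.encode("gb2312")
    except UnicodeEncodeError:
        return (0, ord(ch))
    if len(raw) == 2:
        # GB2312 stores zone and position as bytes offset by 0xA0.
        return (raw[0] - 0xA0, raw[1] - 0xA0)
    return (0, raw[0])  # single-byte ASCII character


def sort_key(record, key_field):
    """Sort key: the zone codes of every character in the key field."""
    return [zone_code(ch) for ch in record[key_field]]


def group_by_first_zone(records, key_field):
    """Sort records by zone code, then bucket them by the first character's zone."""
    groups = defaultdict(list)
    for rec in sorted(records, key=lambda r: sort_key(r, key_field)):
        first = rec[key_field][:1]
        zone = zone_code(first)[0] if first else -1  # -1 collects empty keys
        groups[zone].append(rec)
    return groups


def find_similar(records, key_field, threshold=0.8):
    """Compare key fields only within a group (assumed similarity measure)."""
    pairs = []
    for group in group_by_first_zone(records, key_field).values():
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                a, b = group[i][key_field], group[j][key_field]
                if SequenceMatcher(None, a, b).ratio() >= threshold:
                    pairs.append((a, b))
    return pairs
```

Because comparisons happen only inside a group, the number of pairwise matches drops from all record pairs to the sum of pairs within each group, which is the kind of saving the abstract attributes to grouping.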

Description

Technical field

[0001] The present invention relates to the field of data processing, and more specifically, to a data cleaning and processing method.

Background

[0002] As research in the field of data mining matures, requirements for data quality keep rising. However, data warehouses contain a great deal of redundant or missing data, as well as inconsistent or uncertain data, all of which lower data quality. Data that degrades quality in this way is called "dirty data". By the principle of "garbage in, garbage out", dirty data degrades the quality of data mining, causes decision-analysis systems to produce wrong results, and ultimately misleads decision-making by reducing the accuracy of the decision-maker's predictions and decisions. In addition, dirty data can lead to expensive operations and long response times. Dirty data must therefore be cleaned. A large ...

Application Information

IPC(8): G06F17/30
CPC: G06F16/283, G06F16/215
Inventors: 郝波, 柯炯亮
Owner: 福建中经汇通有限责任公司