Parallel data cleaning system

A data cleaning technology, applied in the field of data cleaning, that can solve problems such as the difficulty of using massive data and the difficulty of integrating information, and that achieves the effects of meeting business needs, improving cleaning efficiency, and improving processing capacity.

Inactive Publication Date: 2018-06-12
GEOVIS CO LTD

AI Technical Summary

Problems solved by technology

[0002] With the development of the Internet, the data scale of enterprises in the major information fields has expanded rapidly, in some cases reaching the PB level. The growing volume of business data produced by users and machines poses greater challenges to IT systems. Storing the data, and later accessing and using it, has become difficult, and obtaining useful information from massive data has become even more difficult.
In addition, since different business data express information in different ways, information integration is also difficult.



Examples


Embodiment Construction

[0020] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments, wherein the schematic embodiments and descriptions are only used to explain the present invention, but are not intended to limit the present invention.

[0021] Referring to Figure 1, the parallel data cleaning system of the present invention includes multiple data sources, multiple data collection devices, a cache server, a data cleaning device, a database, and a business machine. Each data collection device collects data from multiple data sources and transmits the collected data set to the cache server. The data cleaning device obtains the data to be processed from the cache server and saves the cleaned data into the database. The business machine obtains the cleaned data it requires;
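The data flow in paragraph [0021] can be sketched in a few lines of Python. This is an illustrative, single-process sketch only, not the patented implementation: the function names (`collect`, `clean`, `run_pipeline`), the in-memory `deque` standing in for the cache server, the list standing in for the database, and the placeholder cleaning rule (trim whitespace, drop empty and repeated records) are all assumptions made for the example.

```python
from collections import deque


def collect(sources):
    """Data collection device: gather records from several data sources into one data set."""
    return [record for source in sources for record in source]


def clean(records):
    """Data cleaning device (placeholder rule): trim whitespace, drop empties and duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        value = record.strip()
        if value and value not in seen:
            seen.add(value)
            cleaned.append(value)
    return cleaned


def run_pipeline(collector_inputs):
    """Move data from sources through the cache server and the cleaner into the database."""
    cache_server = deque()  # hypothetical stand-in for the cache server
    database = []           # hypothetical stand-in for the database

    # Each collection device pushes its collected data set to the cache server.
    for sources in collector_inputs:
        cache_server.append(collect(sources))

    # The cleaning device pulls pending data sets and stores the cleaned records.
    while cache_server:
        batch = cache_server.popleft()
        database.extend(clean(batch))

    # A business machine would then read the cleaned records it needs.
    return database
```

For example, `run_pipeline([[["a ", "b"], ["a"]], [["", "c"]]])` models two collection devices, the first reading from two sources and the second from one. Note that this sketch removes duplicates only within each collected data set.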

[0022] The data cleaning device includes a control device, a local cache, and multiple processing devices. Multiple processing devic...
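The excerpt is cut off before it explains how the control device, local cache, and processing devices cooperate, so the following Python sketch is only one plausible arrangement under stated assumptions: the control device splits the contents of the local cache into chunks, each processing device (modelled here as a worker thread) cleans one chunk, and the results are merged. The names `clean_chunk`, `split`, and `parallel_clean`, the chunking strategy, and the cleaning rule are hypothetical and do not come from the patent text.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_chunk(chunk):
    """One processing device: normalise records (placeholder rule, assumed)."""
    return [record.strip().lower() for record in chunk if record.strip()]


def split(records, n_parts):
    """Control device helper: split records into at most n_parts contiguous chunks."""
    size = max(1, -(-len(records) // n_parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def parallel_clean(local_cache, n_devices=4):
    """Control device: dispatch chunks to processing devices, then merge the results."""
    chunks = split(local_cache, n_devices)
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        # map() yields results in the same order as the input chunks.
        results = list(pool.map(clean_chunk, chunks))

    # Merge step: drop records that repeat across chunks, keeping first occurrences.
    merged = []
    seen = set()
    for part in results:
        for record in part:
            if record not in seen:
                seen.add(record)
                merged.append(record)
    return merged
```

Threads keep the sketch short and self-contained. For CPU-bound cleaning rules in Python, a process pool or separate machines would be the more likely choice.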


Abstract

The invention relates to a parallel data cleaning system. The system comprises multiple data sources, a plurality of data acquisition devices, a cache server, a data cleaning device, a database, and a business machine. The data cleaning device has a parallel processing capability and comprises a control device, a local cache, and a plurality of processing devices. The system can clean data in parallel, which increases cleaning speed and makes it possible to clean massive data. Improving the processing capability of the cleaning device further enhances cleaning efficiency and allows a plurality of business machines and/or a plurality of users to be served at the same time, so that the different business requirements of different users can be satisfied.

Description

【Technical field】

[0001] The invention belongs to the field of data cleaning, and in particular relates to a parallel data cleaning system.

【Background technique】

[0002] With the development of the Internet, the data scale of enterprises in the major information fields has expanded rapidly, in some cases reaching the PB level. The growing volume of business data produced by users and machines poses greater challenges to IT systems. Storing the data, and later accessing and using it, has become difficult, and obtaining useful information from massive amounts of data has become even more difficult. In addition, since different business data express information in different ways, information integration is also difficult. Furthermore, the same piece of business data may appear in different forms, which generates a large amount of duplicate data. Therefore, it is very necessary to clean the duplicate data for data from different data sources with different forms of ...


Application Information

Patent Type & Authority Applications (China)
IPC IPC(8): G06F17/30
CPC G06F16/215, G06F16/24552
Inventor 安西民林殷徐凤桐
Owner GEOVIS CO LTD