Data center data cleaning system based on data fusion

A data cleaning and data center technology, applied in fields such as electrical digital data processing, special data processing applications, and digital data information retrieval.

Pending Publication Date: 2021-07-27
CHINA INFORMATION CONSULTING & DESIGNING INST CO LTD

AI Technical Summary

Problems solved by technology

[0004] (1) The situation of emphasizing construction over applications and emphasizing hardware over software still needs to be improved;
[0005] (2) Fragmentation and isolated information islands are still serious;
[0006] (3) The degree of information sharing and business collaboration needs to be improved urgently;
[0007] (4) The situation of network and information security is not optimistic;
Since the data cleaning process and the data collection process are combined into a single pipeline, an error in data cleaning also causes the data collection task to fail, and the cleaning process blocks the collection process, reducing collection efficiency. Data cleaning usually takes a long time. If the network connection between a collector and the data center is occupied for a long time, other collectors must queue until resources are free, which limits the number of concurrent connections to the data center to some extent and is not conducive to collecting data from large-scale data sources.
[0028] (3) The visual programming function for data cleaning is limited and cannot meet complex data cleaning needs.
[0029] Since the data fusion platform requires data cleaning rules to be set at the same time as the collection task, it must provide a visual programming method so that users can easily configure cleaning rules. Because data is unpredictable, the cleaning rules would need to be as comprehensive as possible; however, current data fusion platforms provide only simple cleaning rules, which cannot meet the needs of complex data cleaning. Existing professional visual ETL tools cannot be connected directly and can be used only as additional auxiliary modules; they cannot be integrated with the system to form a complete automated system.
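The coupling problem described above, where a cleaning error also fails the collection task, can be illustrated with a minimal Python sketch. All names here (`clean`, `collect_coupled`, the `age` field and the sample records) are hypothetical and do not come from the patent; the sketch only shows the failure mode in a toy setting.

```python
def clean(record):
    # Toy cleaning rule (assumption): the "age" field must parse as an integer.
    return {**record, "age": int(record["age"])}


def collect_coupled(source):
    """Collect and clean in one pipeline step; return the stored records."""
    stored = []
    for record in source:
        # Cleaning runs inline, so a cleaning error aborts the whole task.
        stored.append(clean(record))
    return stored


source = [
    {"id": 1, "age": "30"},
    {"id": 2, "age": "n/a"},  # dirty record
    {"id": 3, "age": "41"},
]

try:
    result = collect_coupled(source)
except ValueError:
    # The collection task fails together with cleaning; record 3 is never read.
    result = None
```

In this toy run a single dirty record makes the whole collection task fail, and records after it are never collected.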

Examples

Embodiment Construction

[0081] With the development and application of big data, more and more related technologies and products are emerging. The most basic and critical of these is data collection. Because traditional application systems are isolated from each other, forming information islands, and their data structures differ, only data collection can gather the data of the various systems into a data resource pool and data marts, enabling big data analysis and application as well as the management, sharing and exchange of data resources. To cover the complete process from data extraction, transmission, cleaning, storage and data management through to data services, corresponding data fusion platforms have been developed. The data fusion platform is based on commercial applications and takes into account the complexity of various actual environments; in order to adapt to various complex external environments, it has formed a certain main f...

Abstract

The invention provides a data center data cleaning system based on data fusion. The system comprises a user operation management module, a data fusion module, a middleware module, a cleaning center module, a Kettle cluster module and a data storage module. Data cleaning in the system is centralized and unified, which avoids the problem of inconsistent decentralized versions. Data collection is not blocked, so collection efficiency is greatly improved and pressure on the central system is relieved. Professional cleaning tools can be fully utilized, which reduces data cleaning development cost. The system is deployed in a distributed mode and performs distributed cleaning, which effectively guarantees performance.
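The abstract states that data collection is not blocked by cleaning. One way this kind of decoupling is commonly achieved is to place a queue between collection and cleaning. The sketch below is an illustration under that assumption, not the patented implementation: Python's standard `queue.Queue` stands in for the middleware module, a single thread stands in for the cleaning center and Kettle cluster, and all function names, fields and records are hypothetical.

```python
import queue
import threading

raw_queue = queue.Queue()  # stand-in for the middleware module (assumption)
clean_store = []           # stand-in for the data storage module
failed = []                # records rejected by the cleaning step


def collect(source):
    """Collector: hand raw records to the queue and return immediately."""
    for record in source:
        raw_queue.put(record)
    return len(source)


def clean(record):
    # Toy cleaning rule (assumption): the "age" field must parse as an integer.
    return {**record, "age": int(record["age"])}


def cleaning_worker():
    """Consume raw records independently of the collection task."""
    while True:
        record = raw_queue.get()
        if record is None:  # sentinel value: no more records
            break
        try:
            clean_store.append(clean(record))
        except ValueError:
            # A cleaning failure is recorded; collection is unaffected.
            failed.append(record)


source = [
    {"id": 1, "age": "30"},
    {"id": 2, "age": "n/a"},  # dirty record
    {"id": 3, "age": "41"},
]

collected = collect(source)  # finishes before any cleaning has run
worker = threading.Thread(target=cleaning_worker)
worker.start()
raw_queue.put(None)          # tell the worker to stop after draining the queue
worker.join()
```

Here the collector completes for all three records regardless of the dirty one, and the cleaning worker sets the bad record aside instead of failing the task. A distributed deployment would replace the in-process queue and thread with a real message broker and multiple cleaning nodes.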

Description

Technical field
[0001] The invention relates to a data cleaning system of a data center based on data fusion.
Background technique
[0002] The following takes the government affairs industry as an example to introduce the relevant technical background of the data fusion platform.
[0003] Current status of government informatization:
[0004] (1) The situation of emphasizing construction over applications and emphasizing hardware over software still needs to be improved;
[0005] (2) Fragmentation and isolated information islands are still serious;
[0006] (3) The degree of information sharing and business collaboration needs to be improved urgently;
[0007] (4) The situation of network and information security is not optimistic;
[0008] Open data sharing requirements:
[0009] At the level of laws and regulations: "Information islands" and data barriers affect the level of data openness and sharing; hinder the improvement of government public service capabilities; ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06F16/2458
CPC: G06F16/215, G06F16/2471
Inventor: 张家健, 万修远, 王佳晓, 朱晨鸣, 周斌, 李元义
Owner: CHINA INFORMATION CONSULTING & DESIGNING INST CO LTD