A mass data quality checking method and system thereof

A massive-data quality inspection technology in the computer field. It addresses the inability of existing systems to handle quality inspection tasks for massive data, to view the details of inspection results, and to statistically analyze and retrieve the results of massive data inspections.

Active Publication Date: 2021-08-03
睿至科技集团有限公司
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] Current data quality inspection is implemented by defining an inspection method and creating an inspection task. According to the content of the inspection method definition, all tasks are scheduled uniformly by the scheduling center and executed uniformly by the data quality inspection center. This approach is easy to manage and implement, but once the amount of data reaches the terabyte (T) level the system can no longer execute inspection tasks. That is, it cannot handle data quality inspection tasks for massive data, cannot show the details of inspection results, and cannot perform real-time statistical analysis and retrieval of the inspection results for massive data.
[0005] In addition, in the power data center, massive data poses a great challenge to the overall data quality inspection work. With massive data, a single inspection takes more than 30 minutes to execute, and some take 4 or 5 hours. When the number of parallel check tasks exceeds 2000, the check tasks cannot continue. The details of the inspection results are stored in the middle platform or in traditional databases such as MySQL, Oracle, and SQL Server, where the massive result sets cannot be queried or retrieved in real time at the level of detailed inspection results. Existing implementations therefore struggle to support both the execution of quality inspection tasks for massive data and the detailed analysis and retrieval of the inspection results.

Embodiment Construction

[0023] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0024] As shown in Figure 1, the present application provides a massive data quality inspection system comprising a power data center 110, a container cloud 120, and a server cluster 130; the server cluster 130 includes multiple servers 1301.

[0025] Power data center 110: stores the massive data to be checked and allows the server cluster to access that data.

[0026] Container cloud 120: selects an execution server from the server cluster.
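The selection step performed by the container cloud can be illustrated with a minimal Python sketch. The source does not state how execution servers are chosen, so the least-loaded policy below is an assumption, as are the names `Server`, `running_tasks`, and `select_execution_servers`.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical view of one server 1301 in the server cluster 130."""
    name: str
    running_tasks: int  # assumed load metric; not specified in the source


def select_execution_servers(cluster, parallel_number):
    """Pick `parallel_number` servers with the fewest running tasks.

    Assumed policy for illustration only: the patent text says the
    container cloud selects execution servers, not how it ranks them.
    """
    if parallel_number > len(cluster):
        raise ValueError("parallel number exceeds the number of servers")
    # nsmallest returns the chosen servers ordered from least to most loaded.
    return heapq.nsmallest(parallel_number, cluster, key=lambda s: s.running_tasks)


cluster = [Server("srv-a", 5), Server("srv-b", 1), Server("srv-c", 3)]
chosen = select_execution_servers(cluster, 2)
chosen_names = [s.name for s in chosen]
```

Under this assumed policy, the two least-loaded servers are returned first, so a scheduler could dispatch one checking task to each of them.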

[...

Abstract

The present application discloses a massive data quality inspection method and system. The system includes a power data middle platform, a container cloud, and a server cluster comprising multiple servers. The power data middle platform stores the massive data to be checked and allows the server cluster to access it; the container cloud selects execution servers from the server cluster. The server cluster is used to: independently deploy basic information; determine the data check information and upload it; access the massive data based on the data check information, and determine the data blocks of the massive data to be checked and the parallel number of data checking tasks; determine, according to the parallel number, the execution servers that need to be scheduled; have the execution servers process the data blocks and generate the checking results; and store the checking results. The application has the technical effect of supporting the scheduling and execution of an unlimited number of data quality inspection tasks, and of supporting the analysis and retrieval of quality inspection results for massive data.
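The flow in the abstract (split the data into blocks, derive a parallel number, process the blocks in parallel, collect the results) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the fixed block size, the cap on parallelism, and the "missing value" check rule are all hypothetical, and local threads stand in for the scheduled execution servers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Cut the records into consecutive blocks of at most `block_size` rows."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def parallel_number(num_blocks, max_parallel):
    """Assumed rule: one task per block, capped by the available servers."""
    return max(1, min(num_blocks, max_parallel))


def check_block(block):
    """Hypothetical check rule: a record fails if its 'value' is missing."""
    failures = [r for r in block if r.get("value") is None]
    return {"checked": len(block), "failed": len(failures), "details": failures}


def run_quality_check(records, block_size, max_parallel):
    """Process every block and return one result dictionary per block."""
    blocks = split_into_blocks(records, block_size)
    workers = parallel_number(len(blocks), max_parallel)
    # Threads stand in for execution servers in this single-machine sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check_block, blocks))


# Ten sample records; ids 0, 4, and 8 have a missing value.
records = [{"id": i, "value": None if i % 4 == 0 else i} for i in range(10)]
results = run_quality_check(records, block_size=3, max_parallel=2)
total_checked = sum(r["checked"] for r in results)
total_failed = sum(r["failed"] for r in results)
```

Keeping the per-block `details` alongside the counts mirrors the stated goal of storing results so that individual failures can later be analyzed and retrieved; where those results are stored is left open here.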

Description

Technical field

[0001] The present application relates to the field of computer technology, in particular to a massive data quality checking method and system thereof.

Background technique

[0002] Data quality management is a method of assessing and managing data against a defined set of business and technical standards and specifications in a given business scenario, using data quality inspection as its means. It is the basic guarantee and measure of data availability and data value.

[0003] The power data of the State Grid grows by 60T every day. This massive data is fed into the data center, which has accumulated a super-massive amount of data. Because the data sources are varied, the formats cannot be unified, which results in serious data quality problems. In order to provide high-quality, applicable data services and data analysis to the data center, it is necessary to carry out data quality governance on the data, check the problematic data through data qua...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/215, G06F16/22, G06F16/27, G06F9/455
CPC: G06F9/45558, G06F2009/4557, G06F16/215, G06F16/221, G06F16/278
Inventor: 宋成平
Owner: 睿至科技集团有限公司