Data quality detection method and system based on multi-dimensional label

A data quality detection method, applied in the field of data processing, that addresses problems such as poor accuracy and weak timeliness, with the effect of improving data quality, improving timeliness, and reducing dirty data.

Active Publication Date: 2020-08-21
XIAMEN MEIYA PICO INFORMATION

AI Technical Summary

Problems solved by technology

[0004] The purpose of this application is to propose a data quality detection method and system based on multi-dimensional labels, in order to solve the poor accuracy and weak timeliness caused by fixed detection rule templates.




Embodiment Construction

[0044] The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention and do not limit it. It should also be noted that, for convenience of description, the drawings show only the parts relevant to the related invention.

[0045] It should be noted that, where no conflict arises, the embodiments in the present application and the features of those embodiments can be combined with one another. The present application is described in detail below with reference to the accompanying drawings and embodiments.

[0046] Figure 1 shows a flow chart of a data quality detection method based on multi-dimensional labels according to an embodiment of the present application. As shown in Figure 1, the method includes the steps of data item classification, dimension label analysis...
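The first two steps named above, data item classification and dimension label analysis, can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the `DataItem` class, the `KNOWN_TYPE_LABELS` table, and the label names are hypothetical and do not come from the patent text, which is truncated at this point.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Hypothetical mapping from known data item types to dimension labels.
# The type and label names are illustrative only.
KNOWN_TYPE_LABELS = {
    "phone_number": {"completeness", "format"},
    "id_number": {"completeness", "format", "uniqueness"},
}


@dataclass
class DataItem:
    name: str
    item_type: Optional[str]  # None when the type could not be recognised
    labels: Set[str] = field(default_factory=set)


def classify_and_label(items: List[DataItem]) -> Tuple[List[DataItem], List[DataItem]]:
    """Split items into known and unknown types and tag the known ones."""
    known, unknown = [], []
    for item in items:
        labels = KNOWN_TYPE_LABELS.get(item.item_type)
        if labels is None:
            # Unknown type: left unlabelled for a later recommendation step.
            unknown.append(item)
        else:
            # Known type: copy the labels so items do not share one set.
            item.labels = set(labels)
            known.append(item)
    return known, unknown
```

In this sketch the labels attached to a known-type item would then decide which quality checks run for it, which is one possible reading of "dynamically adjusting the quality detection process".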


Abstract

The invention discloses a data quality detection method and system based on multi-dimensional labels. For data items of known type, a multi-dimensional label analysis algorithm uses the detection rule base to mark each item with its corresponding dimension labels, and these labels are used to dynamically adjust the quality detection process for known-type data items. For data items of unknown type, a rule similarity evaluation algorithm, combined with the detection rule base, recommends a quality detection engine for the unknown-type data source, and the results of that engine are verified to obtain an effective set of quality detection rules. The quality detection process for known-type data items and the effective quality detection rule set are then stored, and the multi-dimensional label rule base is updated. Through the multi-dimensional label algorithm and the rule similarity evaluation algorithm, the scheme solves problems such as poor accuracy and weak timeliness caused by fixed detection rule templates, detects data quality quickly and accurately, feeds detection results back in time, and improves the quality of the data source.
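The recommendation and verification flow for unknown-type data items can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the abstract does not specify the similarity measure, so Jaccard similarity over feature sets is swapped in here, and `RULE_BASE`, the feature names, the thresholds, and the function names are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical detection rule base: rule name -> (features of the data
# items the rule was written for, check applied to a single value).
RULE_BASE = {
    "phone_format": ({"numeric", "fixed_length", "len_11"},
                     lambda v: v.isdigit() and len(v) == 11),
    "not_empty": ({"text"},
                  lambda v: bool(v.strip())),
    "date_format": ({"numeric", "fixed_length", "len_8"},
                    lambda v: v.isdigit() and len(v) == 8),
}


def recommend_rules(item_features, threshold=0.5):
    """Return (rule name, score) pairs at or above threshold, best first."""
    scored = [(name, jaccard(item_features, feats))
              for name, (feats, _) in RULE_BASE.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


def verify_rules(candidates, sample_values, min_pass_rate=0.9):
    """Keep only the recommended rules that most sample values satisfy."""
    effective = []
    for name, _ in candidates:
        check = RULE_BASE[name][1]
        passed = sum(1 for v in sample_values if check(v))
        if sample_values and passed / len(sample_values) >= min_pass_rate:
            effective.append(name)
    return effective


candidates = recommend_rules({"numeric", "fixed_length", "len_11"})
effective = verify_rules(candidates, ["13800000000", "13912345678"])
```

Here `candidates` contains both `phone_format` and `date_format`, and verification against the sample values keeps only `phone_format`. The verified set is what would be written back to the rule base in the final step the abstract describes.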

Description

Technical field

[0001] The present application relates to the technical field of data processing, and in particular to a data quality detection method and system based on multi-dimensional labels.

Background

[0002] "Big data" is a massive, high-growth, and diversified information asset that requires new processing models to deliver stronger decision-making power, insight discovery, and process optimization capabilities. As big data systems in various places are continuously connected to different industries, raw data is generated from a variety of data sources and, after reprocessing, forms the final information assets. The quality of each data source is the basis for the effectiveness of the big data system. How to quickly and accurately detect whether various data sources have quality problems, give early warning, improve the quality of the data sources, and reduce the proportion of dirty data in the final information assets is the key to whet...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06F16/28
CPC: G06F16/215, G06F16/283, Y02P90/30
Inventor: 林文楷, 周成祖, 乔赞瑞, 王海滨, 吴朝晖, 齐战胜
Owner: XIAMEN MEIYA PICO INFORMATION