Data consistency evaluation method based on data distribution fluctuation ratio

A consistency evaluation technology based on fluctuations in data distribution, applied in database updating, electronic digital data processing, structured data retrieval, and similar areas. It addresses problems such as data loss, erroneous modification, and the inability to detect the affected data.

Pending Publication Date: 2020-05-01
UESTC COMSYS INFORMATION

AI Technical Summary

Problems solved by technology

[0006] Bugs in the business system or errors in the ETL process cause some data to be lost or wrongly modified.
The usual evaluation methods cannot detect such anomalous data.

Examples

Embodiment Construction

[0016] To help those skilled in the art understand the technical content of the present invention, it is further explained below with reference to the accompanying drawings.

[0017] First, the application scenario of the present invention is introduced. The invention can be used in any scenario that requires evaluating the magnitude of the change, compared with the past, in the counts of the value modes of a field.

[0018] In this embodiment, the present invention is described in detail using the "student status change subclass table T" as an example, which includes the fields "student number F2", "change situation F1", and "change time F0". The values of F0 range over [2010-9-1, 2019-8-30], and the value modes of F1 include "study abroad", "apply by myself", "leave school without permission", "expiration of school leave", "study status clearing", "poor grade" The median value pat...
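The example table can be sketched in Python as follows. This is only an illustrative sketch: the rows, the student numbers, and the cut-off date `CUTOFF` are hypothetical and do not come from the patent text, which is truncated here. It shows one way to split table T into historical and current data on the timestamp field F0 and to count the value modes of F1 in each part.

```python
from collections import Counter
from datetime import date

# Hypothetical rows of table T, ordered as (F0 change time, F1 change situation, F2 student number).
rows = [
    (date(2017, 3, 1), "study abroad", "S001"),
    (date(2018, 5, 20), "poor grade", "S002"),
    (date(2018, 11, 2), "study abroad", "S003"),
    (date(2019, 4, 15), "poor grade", "S004"),
    (date(2019, 6, 30), "leave school without permission", "S005"),
]

# Assumed boundary between historical and current data (not given in the source).
CUTOFF = date(2019, 1, 1)


def split_by_time(rows, cutoff):
    """Split rows into (historical, current) using the timestamp in position 0."""
    historical = [r for r in rows if r[0] < cutoff]
    current = [r for r in rows if r[0] >= cutoff]
    return historical, current


def count_value_modes(rows):
    """Count how often each F1 value mode occurs in the given rows."""
    return Counter(r[1] for r in rows)


historical, current = split_by_time(rows, CUTOFF)
hist_counts = count_value_modes(historical)
curr_counts = count_value_modes(current)
```

The two counters are the inputs that a later comparison step would turn into proportions.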

Abstract

The invention discloses a data consistency evaluation method based on a data distribution fluctuation ratio. It is applied in the field of big data analysis and processing and aims to solve the prior-art problem of data loss or modification errors caused by bugs in a business system or errors in the ETL process. The method comprises the following steps: first, the data to be tested is divided into historical data and current data according to a timestamp field; then, the current proportion and the past proportion of each value mode in the data to be tested are analyzed, and the change amplitude of the proportion is compared with a given threshold. If the change amplitude of the proportion of a value mode is greater than the threshold, the data is considered to have a consistency problem; otherwise, the data is normal. The method can quickly and effectively find data loss or modification errors caused by bugs in the business system or errors in the ETL process.
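The comparison step described in the abstract might look like the minimal sketch below. The abstract does not state how the change amplitude is computed, so the absolute difference between the current and past proportions is an assumption made here for illustration. The function names, the sample values, and the threshold of 0.2 are likewise hypothetical.

```python
from collections import Counter


def proportions(values):
    """Return the share of each value mode among the given field values."""
    total = len(values)
    if total == 0:
        return {}
    return {mode: n / total for mode, n in Counter(values).items()}


def inconsistent_value_modes(historical, current, threshold):
    """Return value modes whose proportion changed by more than threshold.

    Assumption: change amplitude = |current proportion - past proportion|.
    The source may define the fluctuation ratio differently.
    """
    past = proportions(historical)
    now = proportions(current)
    flagged = {}
    for mode in sorted(set(past) | set(now)):
        change = abs(now.get(mode, 0.0) - past.get(mode, 0.0))
        if change > threshold:
            flagged[mode] = change
    return flagged


# Hypothetical field values after splitting on the timestamp field.
historical = ["A"] * 50 + ["B"] * 50
current = ["A"] * 90 + ["B"] * 10
flagged = inconsistent_value_modes(historical, current, threshold=0.2)
```

A value mode that is missing from one of the two periods is treated as having a proportion of zero there, so a mode that disappears entirely is also flagged when its past share exceeded the threshold.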

Description

Technical field

[0001] The invention belongs to the field of big data analysis and processing, and particularly relates to a consistency evaluation technology for structured data.

Background technique

[0002] Structured data is, simply put, data held in a database. It is easier to understand in the context of typical scenarios, such as enterprise ERP and financial systems, medical HIS databases, education cards, government administrative approval, and other core databases.

[0003] It basically includes high-speed storage application requirements, data backup requirements, data sharing requirements, and data disaster recovery requirements.

[0004] Structured data, also known as row data, is data that is logically expressed and implemented by a two-dimensional table structure. It strictly follows data format and length specifications, and is mainly stored and managed through relational databases. In contrast to structured data, it is unstructured data that is not suitable for being represen...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06F16/23
CPC: G06F16/215, G06F16/2365, G06F16/2322
Inventor: 唐雪飞, 蒲高飞, 黄永鑫, 王东方, 胡茂秋
Owner: UESTC COMSYS INFORMATION