Data quality evaluation method based on information entropy value

A data quality evaluation technique based on information entropy value, applied in the field of data processing. It addresses the problem that, when modeling data is poor, the specific cause cannot be identified.

Pending Publication Date: 2020-11-20
格创东智(深圳)科技有限公司

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a data quality evaluation method based on information entropy value. It addresses the following problem: when the modeling data is of poor quality, the model score drops significantly, yet the specific cause of the poor data quality cannot be determined.

Examples

Embodiment Construction

[0053] The preferred embodiments of the present invention will be described below in conjunction with the accompanying drawings. It should be understood that the preferred embodiments described here are only used to illustrate and explain the present invention, and are not intended to limit the present invention.

[0054] The present invention provides a data quality evaluation method based on information entropy value. It addresses the problem that, when the modeling data is of poor quality, the model score drops significantly, yet the specific cause of the poor data quality cannot be determined.

[0055] An embodiment of the present invention proposes a data quality evaluation method based on information entropy value. As shown in Figure 1 and Figure 2, the method evaluates the data set before the data is modeled, and includes:

[0056] S1. Evaluate the data volume of the data set required for modeling to obtain a data volume evaluation score;

[0...
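
The step list is truncated in this excerpt, so the patent's own formulas are not visible here. As background for the "information entropy value" in the title, the following is a minimal sketch of Shannon entropy computed over the observed values of one column. The function names `shannon_entropy` and `normalized_entropy` are hypothetical, and nothing here is confirmed to be how the patent uses entropy in its scores.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def normalized_entropy(values):
    """Entropy divided by its maximum, log2(k), for k distinct values.

    Returns 0.0 when the column has at most one distinct value.
    (Normalizing to [0, 1] is an assumption made for illustration.)
    """
    k = len(set(values))
    if k <= 1:
        return 0.0
    return shannon_entropy(values) / math.log2(k)
```

A constant column yields an entropy of zero, which is one common reason to treat such a factor as uninformative; whether the patent applies that rule is not stated in the excerpt.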

Abstract

The invention provides a data quality evaluation method based on information entropy value for evaluating a data set before it is used for modeling. The method comprises the following steps: evaluating the data volume of the data set required for modeling to obtain a data volume evaluation score; evaluating the effectiveness of the factors in the data set, on the basis of the data volume evaluation result, to obtain factor evaluation scores; evaluating the dependent variable in the modeling data to obtain a dependent variable evaluation score; and combining the data volume evaluation score, the factor evaluation scores and the dependent variable evaluation score into a comprehensive score for the data set, which serves as the basis for judging the quality of the data set.
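
The final combining step can be sketched as follows. This is an illustration under stated assumptions, not the patent's formula: the excerpt does not say how the three scores are combined, so the weighted average, the equal default weights, the 0.6 threshold, the [0, 1] score range and all names (`QualityScores`, `composite_score`, `judge_quality`) are hypothetical. The factor evaluation scores are assumed to have already been aggregated into a single value.

```python
from dataclasses import dataclass


@dataclass
class QualityScores:
    """Three sub-scores, each assumed to lie in [0, 1]."""
    data_volume: float
    factor: float              # assumed: per-factor scores already aggregated
    dependent_variable: float


def composite_score(scores, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three sub-scores (hypothetical combination rule)."""
    parts = (scores.data_volume, scores.factor, scores.dependent_variable)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each sub-score must lie in [0, 1]")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, parts)) / total_weight


def judge_quality(score, threshold=0.6):
    """Map a composite score to a verdict; the threshold is an assumption."""
    return "acceptable" if score >= threshold else "poor"


# Usage: a data set with enough rows but an uninformative dependent variable.
scores = QualityScores(data_volume=1.0, factor=0.5, dependent_variable=0.0)
verdict = judge_quality(composite_score(scores))
```

Keeping the three sub-scores alongside the composite lets a user see which aspect of the data set pulled the verdict down, which is the gap the method is said to address.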

Description

Technical field

[0001] The invention proposes a data quality evaluation method based on information entropy value, and belongs to the technical field of data processing.

Background

[0002] Intelligent analysis of factory quality is one of the important applications in the transformation and upgrading toward intelligent manufacturing. Collecting data from manufacturing enterprises, analyzing and mining that data effectively, and building models to monitor production are an important application for intelligent manufacturing enterprises today. Before analyzing the data, users often face the question of whether the data samples meet the modeling requirements. Analysts usually judge whether the data quality meets the requirements by building a model, evaluating it, and checking the accuracy of the final model. In general, manufacturing companies separate data collection from data analysis, and the staf...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06Q10/06, G06Q50/04
CPC: G06Q10/06395, G06Q50/04, Y02P90/30
Inventors: 翟伟辰, 何军
Owner: 格创东智(深圳)科技有限公司