Data missing value filling method based on functional dependence and clustering

A technique for filling missing data values using functional dependencies, applied in the field of data processing, which stabilizes clustering results and improves filling accuracy.

Pending Publication Date: 2022-07-12
ZHENGZHOU UNIVERSITY OF LIGHT INDUSTRY

AI Technical Summary

Problems solved by technology

Existing hybrid missing-value filling methods still leave room for improvement in capturing the dependencies between attributes.




Embodiment Construction

[0052] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0053] As shown in Figure 1, a method for filling missing data values based on functional dependence and clustering includes the following steps:

[0054] S1. Check the data set to be processed. If the data set contains missing values, automatically divide it into a complete data subset D_complete and an incomplete data subset D_missing;

[0055] S2. Process the complete data subset D_complete obtained in S1, obta...
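Step S1 can be illustrated with a minimal sketch. It assumes each tuple is a Python dict and that a missing value is represented by None; the function name split_dataset and the sample attributes (zip, city, age) are hypothetical and do not come from the patent text.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_dataset(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into (complete, incomplete) subsets.

    Assumption: a missing value is represented by None.
    """
    complete: List[Row] = []
    incomplete: List[Row] = []
    for row in rows:
        if any(value is None for value in row.values()):
            incomplete.append(row)
        else:
            complete.append(row)
    return complete, incomplete


# Hypothetical sample data, for illustration only.
rows = [
    {"zip": "450001", "city": "Zhengzhou", "age": 31},
    {"zip": "450001", "city": None, "age": 40},
    {"zip": "100000", "city": "Beijing", "age": 25},
]
d_complete, d_missing = split_dataset(rows)
```

Here d_complete holds the two fully populated tuples and d_missing holds the tuple whose city is unknown.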


Abstract

The invention provides a method for filling missing data values based on functional dependence and clustering, comprising the following steps: checking the data set to be processed to obtain a complete data subset and an incomplete data subset; obtaining a set of functional dependencies from the complete data subset with the HyFD algorithm and sorting the functional dependencies in ascending order; judging whether the missing attribute of an incomplete tuple appears in the RHS (right-hand side) set of the functional dependency set; if it does, filling the incomplete tuple using a complete tuple from the complete data subset; if it does not, clustering the complete data subset with an improved AP (affinity propagation) clustering algorithm, processing the incomplete tuple with a KNN algorithm, and finally filling it using complete tuples. The method effectively improves the accuracy of missing-value filling.
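The decision described in the abstract (fill from a functional dependency when the missing attribute is on some RHS, otherwise fall back to nearest neighbours) can be sketched as follows. This is a simplified illustration, not the patented implementation, and it makes several assumptions: the functional dependencies are supplied as input rather than discovered with HyFD; "ascending order" is taken to mean fewer LHS attributes first; the improved AP clustering step is omitted, so the KNN fallback searches the whole complete subset; and the distance function is a simple choice made for this sketch. All function names and sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Any, Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Any]
FD = Tuple[Tuple[str, ...], str]  # (LHS attributes, RHS attribute)


def fill_by_fd(row: Row, attr: str, fds: Sequence[FD],
               complete: List[Row]) -> Optional[Any]:
    """Copy `attr` from a complete tuple that agrees with `row` on an FD's LHS."""
    # Assumption: "ascending order" means fewer LHS attributes first.
    for lhs, rhs in sorted(fds, key=lambda fd: len(fd[0])):
        if rhs != attr or any(row.get(a) is None for a in lhs):
            continue
        for candidate in complete:
            if all(candidate[a] == row[a] for a in lhs):
                return candidate[attr]
    return None


def _distance(row: Row, other: Row, skip: str) -> float:
    """Squared difference for numbers, 0/1 mismatch otherwise (sketch only)."""
    total = 0.0
    for a, v in row.items():
        if a == skip or v is None:
            continue
        w = other[a]
        if isinstance(v, (int, float)) and isinstance(w, (int, float)):
            total += (v - w) ** 2
        else:
            total += 0.0 if v == w else 1.0
    return math.sqrt(total)


def fill_by_knn(row: Row, attr: str, complete: List[Row], k: int = 3) -> Optional[Any]:
    """Mean (numeric) or majority vote (categorical) over the k nearest tuples."""
    nearest = sorted(complete, key=lambda c: _distance(row, c, attr))[:k]
    values = [c[attr] for c in nearest]
    if not values:
        return None
    if all(isinstance(v, (int, float)) for v in values):
        return sum(values) / len(values)
    return Counter(values).most_common(1)[0][0]


def impute(row: Row, fds: Sequence[FD], complete: List[Row], k: int = 3) -> Row:
    """Fill each missing attribute: FD first, KNN fallback otherwise."""
    filled = dict(row)
    for attr, value in row.items():
        if value is None:
            guess = fill_by_fd(row, attr, fds, complete)
            if guess is None:
                guess = fill_by_knn(row, attr, complete, k)
            filled[attr] = guess
    return filled


# Hypothetical data: zip -> city is the only functional dependency.
complete = [
    {"zip": "450001", "city": "Zhengzhou", "age": 31},
    {"zip": "100000", "city": "Beijing", "age": 25},
    {"zip": "450001", "city": "Zhengzhou", "age": 35},
]
fds = [(("zip",), "city")]
by_fd = impute({"zip": "450001", "city": None, "age": 40}, fds, complete)
by_knn = impute({"zip": "450001", "city": "Zhengzhou", "age": None}, fds, complete, k=2)
```

In the first call the missing attribute city is the RHS of zip -> city, so it is copied from a complete tuple with the same zip. In the second call age is not on any RHS, so the value comes from the nearest complete tuples.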

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to a method for filling missing data values based on functional dependence and clustering.

Background

[0002] With the continuous development of machine learning and deep learning, big data technology is applied across many fields of industry, decisions are made according to the results of data analysis and data prediction, and the demand for high-quality data keeps growing. In practice, however, collected data contain a large number of missing values for reasons such as monitoring errors of industrial equipment, the influence of natural conditions, and manual misoperation. This occurs in many fields, such as medicine, finance, and electricity. Missing data seriously degrades the quality of the entire data set, causing a certain degree of deviation between the results o...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/28, G06F16/215
CPC: G06F16/285, G06F16/215
Inventor: 吴怀广, 李帅超, 史雯隽, 杜少卿
Owner: ZHENGZHOU UNIVERSITY OF LIGHT INDUSTRY