KNN-based improved missing data filling algorithm

A missing data filling algorithm, applied in data mining, digital data processing, and special data processing applications, that aims for accurate filling results and wide applicability
CN106407464A · Inactive · Publication Date: 2017-02-15 · NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Patent Information

Authority / Receiving Office: CN · China
Current Assignee / Owner: NANJING UNIV OF AERONAUTICS & ASTRONAUTICS
Publication Date: 2017-02-15
Estimated Expiration: Not applicable · inactive patent

Abstract

The invention provides a KNN-based improved missing data filling algorithm comprising three steps: (1) improving the traditional multiple correlation coefficient inverse weighting method, using the improved method to calculate the importance of each attribute to the attribute containing missing values, deleting the few attributes that correlate only weakly with that key attribute, and thereby reducing the attribute set to obtain a data sample set containing only the reduced attributes; (2) using the Mahalanobis distance, which accounts for both the correlation between attributes and their variability, in combination with grey correlation analysis, which can effectively predict samples containing uncertain factors, to calculate the K nearest samples of a sample with missing data; and (3) assigning entropy weights to the attributes corresponding to the K samples according to the K calculated distance values and the entropy weight method, and then calculating the final filling value by combining the attribute values. The algorithm reduces the computational complexity of missing data filling, improves the accuracy of the selected neighbor samples, and improves the estimation accuracy of the filled values.
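The three steps can be illustrated with a minimal Python sketch. It is not the patented method: the excerpt does not give the exact formulas, so each step uses a labeled stand-in. Absolute Pearson correlation replaces the improved multiple correlation coefficient inverse weighting, a plain Mahalanobis distance is used without the grey correlation analysis component, and inverse-distance weights replace the entropy weight method. The function name `knn_fill` and its parameters `k` and `keep` are hypothetical, and the sketch assumes the incomplete row is missing only the target attribute.

```python
import numpy as np


def knn_fill(data, row, col, k=3, keep=2):
    """Estimate data[row][col] from the k nearest complete rows (illustrative only)."""
    data = np.asarray(data, dtype=float)
    complete = data[~np.isnan(data).any(axis=1)]  # rows without any missing value
    others = [j for j in range(data.shape[1]) if j != col]

    # Step 1 (stand-in): rank the other attributes by |Pearson r| with the
    # target attribute and keep only the `keep` most strongly related ones.
    corr = [abs(np.corrcoef(complete[:, j], complete[:, col])[0, 1]) for j in others]
    kept = [others[i] for i in np.argsort(corr)[::-1][:keep]]

    # Step 2 (partial): Mahalanobis distance over the kept attributes.
    # The grey correlation analysis component is omitted in this sketch.
    X = complete[:, kept]
    inv_cov = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    diff = X - data[row, kept]
    d2 = np.sum((diff @ inv_cov) * diff, axis=1)
    dist = np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative rounding errors
    nearest = np.argsort(dist)[:k]

    # Step 3 (stand-in): inverse-distance weights instead of entropy weights.
    weights = 1.0 / (dist[nearest] + 1e-9)
    weights /= weights.sum()
    return float(weights @ complete[nearest, col])


# Made-up sample data: the last row is missing its fourth attribute.
rows = [
    [1.0, 2.0, 5.0, 3.0],
    [2.0, 1.0, 3.0, 6.1],
    [3.0, 4.0, 6.0, 8.9],
    [4.0, 3.0, 2.0, 12.2],
    [5.0, 6.0, 4.0, 15.0],
    [3.0, 4.0, 1.0, float("nan")],
]
filled = knn_fill(rows, row=5, col=3)
print(filled)
```

In this made-up data the third attribute correlates only weakly with the target, so it is dropped in step 1. The incomplete row then coincides with the third complete row on the two retained attributes, and that row's value dominates the weighted estimate.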

Description

Technical field

[0001] The invention relates to the field of missing data filling, in particular to an improved missing data filling algorithm based on KNN.

Background technique

[0002] In practical applications, differences in data acquisition methods or data modeling mean that some of the obtained data do not fully conform to the previously defined format and are marked as "unknown" or left blank; such data are called incomplete data or missing values. Missing values are common in fields such as medicine, survey research, and industry. Inaccurate measurement methods, limitations of collection conditions, and omissions in manual entry may all lead to missing data, which has very adverse effects on data mining work. For example, missing values may directly affect the accuracy of newly discovered patterns, leading to wrong mining models. In association rules, the unknown of missing values will interfere with the normal data distribution and affect t...

Claims
