Nearest neighbor filling method of non-fixed k values

A missing-data filling method based on nearest neighbor technology, applicable to neural learning methods, special data processing applications, instruments, etc. It addresses problems such as unreasonable distance calculations and is easy to implement.

Inactive Publication Date: 2014-01-29
GUANGXI NORMAL UNIV

AI Technical Summary

Problems solved by technology

This method addresses two problems of existing nearest neighbor filling: the distance calculation is unreasonable, and the same nearest neighbor k value is used for all missing instances.



Examples


Embodiment Construction

[0024] First, the mixed-attribute distance calculation. Attributes commonly encountered in research are divided into five categories: continuous, symmetric binary, asymmetric binary, unordered discrete, and ordered discrete. The distance of the present invention is defined as follows (formula not reproduced here), where an indicator term represents whether a value is missing in instance i or j (0 if it is missing, 1 otherwise), f denotes the attribute of the f-th category among the five categories of attributes, n is the number of attributes, and $d_{ij}^{f}$ is the distance between instances i and j on attribute f.
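Because the formula itself is missing from this extract, the following is only a minimal sketch of how an indicator-weighted mixed distance of this kind is commonly computed. It assumes the per-attribute distances are averaged over the attributes observed in both instances; that normalisation, and the names `mixed_distance`, `continuous_distance` and `nominal_distance`, are illustrative assumptions rather than the patent's definitions.

```python
from typing import Callable, Optional, Sequence


def continuous_distance(value_range: float) -> Callable[[float, float], float]:
    """Absolute difference scaled by an assumed attribute range (hypothetical helper)."""
    return lambda x, y: abs(x - y) / value_range


def nominal_distance(x: object, y: object) -> float:
    """0 when the two categories match, 1 otherwise (hypothetical helper)."""
    return 0.0 if x == y else 1.0


def mixed_distance(
    a: Sequence[object],
    b: Sequence[object],
    per_attribute: Sequence[Callable],
) -> Optional[float]:
    """Average the per-attribute distances over attributes present in both instances.

    The indicator is 0 when either value is missing (None) and 1 otherwise.
    Dividing by the sum of indicators is an assumption, not taken from the source.
    """
    total = 0.0
    observed = 0
    for x, y, dist in zip(a, b, per_attribute):
        if x is None or y is None:
            continue  # indicator = 0: this attribute contributes nothing
        total += dist(x, y)
        observed += 1
    return total / observed if observed else None


example_a = [1.0, "red", None]
example_b = [3.0, "red", 5.0]
metrics = [continuous_distance(4.0), nominal_distance, continuous_distance(10.0)]
print(mixed_distance(example_a, example_b, metrics))  # 0.25
```

In this example the third attribute is skipped because it is missing in the first instance, so the result is (0.5 + 0.0) / 2.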

[0025] A. Continuous value distance calculation

[0026] The distance between two instances over continuous attributes is calculated as follows (formula not reproduced here), where n is the number of continuous attributes in instances i and j, $A_{i,k}$ is the value of the k-th attribute of instance i, and the mean term is the average of the n continuous attributes of instance i.

[0027] B. Symmetric Binary and Asymmetric Binary Attribute Distance Computation

[0028] If the di...


Abstract

The invention provides a nearest neighbor filling method with non-fixed k values, aimed mainly at overcoming the shortcomings of existing nearest neighbor filling methods. The method comprises the following steps: first, the distance formulas for the various attribute types are defined reasonably; then, sparse coding is used to select an appropriate k value for each missing instance, while also selecting the attributes most relevant to that missing instance; finally, according to the k value obtained, the k non-missing instances closest to the missing instance are selected to fill the missing value. The method solves the practical problem of filling missing data, makes the filling more reasonable, and improves filling quality without increasing filling complexity. It is easy to implement, requiring only a few simple mathematical models when writing the code.
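The abstract does not say how sparse coding yields k, so the following is only one plausible reading, sketched under stated assumptions: the observed part of the missing instance is reconstructed as an L1-penalised combination of the complete instances, and k is taken as the number of non-zero weights. The objective, the coordinate-descent solver, and the names `sparse_code`, `select_k` and `lam` are all hypothetical.

```python
from typing import List, Sequence


def soft_threshold(value: float, lam: float) -> float:
    """Shrink value towards zero by lam (the proximal step for an L1 penalty)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_code(x: Sequence[float], atoms: Sequence[Sequence[float]],
                lam: float, sweeps: int = 200) -> List[float]:
    """Coordinate descent for 0.5 * ||x - sum_j w_j * atoms[j]||^2 + lam * ||w||_1."""
    weights = [0.0] * len(atoms)
    residual = list(x)  # equals x minus the current reconstruction
    for _ in range(sweeps):
        for j, atom in enumerate(atoms):
            norm_sq = sum(a * a for a in atom)
            if norm_sq == 0.0:
                continue
            # Correlate atom j with the residual that excludes its own contribution.
            rho = sum(a * (r + weights[j] * a) for a, r in zip(atom, residual))
            new_weight = soft_threshold(rho, lam) / norm_sq
            delta = new_weight - weights[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                weights[j] = new_weight
    return weights


def select_k(x: Sequence[float], atoms: Sequence[Sequence[float]],
             lam: float, tol: float = 1e-8) -> int:
    """Assumed rule: k is the number of non-zero weights, and at least 1."""
    weights = sparse_code(x, atoms, lam)
    return max(1, sum(1 for w in weights if abs(w) > tol))


# Toy data: three complete instances (restricted to the observed attributes)
# and the observed part of one instance that has a missing value elsewhere.
complete_instances = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
observed_part = [1.0, 1.0, 0.0]
print(select_k(observed_part, complete_instances, lam=0.1))  # 2
```

A larger `lam` gives a sparser code and therefore a smaller k, so under this reading the penalty strength controls how many neighbours each missing instance receives.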

Description

Technical field

[0001] The invention relates to the fields of computer science and information technology, and in particular to a method for filling missing data using the nearest neighbor method with a non-fixed k value.

Background technique

[0002] The principle of the nearest neighbor algorithm (kNN) can be described as follows: the two instances with the smallest distance between them are the most closely related. Therefore, if an instance has a missing value (whether in a condition attribute or a decision attribute), its distance to the other instances in the data set that have no missing values can be calculated, and the instances nearest to it can then be found. Finally, the missing value is replaced by the value of the nearest instances on that attribute (discrete attribute) or by their average (continuous attribute).

[0003] Since the nearest neighbor method is a lazy learning method (Lazy Learning) based on instance learning, it does not actually construct a cl...
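The conventional fixed-k filling described in paragraph [0002] can be sketched as follows. This is a minimal illustration, not the patent's method: it assumes squared differences for continuous attributes, a 0/1 mismatch for discrete attributes, and the most frequent neighbour value for discrete fills. The function name `knn_impute` and the toy data are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import List, Sequence, Set


def knn_impute(target: Sequence[object], complete: Sequence[Sequence[object]],
               k: int, discrete: Set[int]) -> List[object]:
    """Fill the None entries of target from its k nearest complete instances."""
    observed = [c for c, v in enumerate(target) if v is not None]

    def distance(row: Sequence[object]) -> float:
        # Only attributes observed in the target take part in the distance.
        total = 0.0
        for c in observed:
            if c in discrete:
                total += 0.0 if row[c] == target[c] else 1.0
            else:
                total += (row[c] - target[c]) ** 2
        return sqrt(total)

    neighbours = sorted(complete, key=distance)[:k]
    filled = list(target)
    for c, value in enumerate(target):
        if value is None:
            candidates = [row[c] for row in neighbours]
            if c in discrete:
                # Discrete attribute: most frequent value among the neighbours.
                filled[c] = Counter(candidates).most_common(1)[0][0]
            else:
                # Continuous attribute: average of the neighbours' values.
                filled[c] = sum(candidates) / len(candidates)
    return filled


rows = [[1.0, 2.0, "a"], [1.2, 2.2, "a"], [5.0, 6.0, "b"], [0.9, 1.8, "b"]]
result = knn_impute([1.0, None, None], rows, k=3, discrete={2})
print(result)
```

Here the three nearest complete instances are the first, fourth and second rows, so the continuous gap is filled with their average and the discrete gap with their most common label. Every missing instance uses the same k, which is the limitation the invention sets out to remove.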


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06N3/02, G06N3/08
Inventors: 张师超, 朱晓峰, 刘星毅
Owner: GUANGXI NORMAL UNIV