Multi-algorithm fusion-based missing value interpolation method

A missing-value imputation technique applied in the field of data processing. It addresses two problems: values imputed by different methods can differ widely, and the reasonableness of an imputation is difficult to control. Its stated effects are a reduced number of variables, small errors, and stable imputed values.

Inactive Publication Date: 2018-06-22
GUANGDONG KINGPOINT DATA SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

Without a comprehensive understanding of the data distribution and the pattern of missingness, it is impossible to know which method imputes missing values best. In practice, people often pick one of the many imputation methods based on past experience or arbitrarily, so the reasonableness of the imputation is difficult to control. For key variables in particular, the values obtained by different imputation methods may differ widely, and the resulting findings and research conclusions may be completely different.

Examples

Embodiment Construction

[0019] The above and other technical features and advantages of the present invention will be described in more detail below in conjunction with the accompanying drawings.

[0020] An artificial neural network (ANN), or neural network for short, is a mathematical model of computation that imitates the behavior of animal neural networks and performs distributed, parallel information processing. Depending on the complexity of the system, such a network processes information by adjusting the interconnections among a large number of internal nodes.

[0021] Figure 1 shows a flow chart of the missing value interpolation method based on the fusion of multiple algorithms provided by the present invention. The method includes the following steps:

[0022] Step S1: Perform hierarchical clustering on all data.

[0023] This ensures that complete data and missing data of the same type are grouped together for analysis ...
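
Step S1 can be illustrated with a minimal sketch. The source does not say which linkage or distance the invention uses, so the choices here are assumptions: single linkage, and a Euclidean distance computed only over the fields that both records have observed (`None` marks a missing value). The names `partial_distance` and `hierarchical_clusters` and the sample records are hypothetical.

```python
import math

def partial_distance(a, b):
    """Euclidean distance over fields observed in both records (None = missing)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # no field in common, so the pair cannot be compared
    # Assumption: rescale so that pairs with fewer shared fields are not favoured.
    scale = len(a) / len(shared)
    return math.sqrt(scale * sum((x - y) ** 2 for x, y in shared))

def hierarchical_clusters(records, k):
    """Single-linkage agglomerative clustering; returns k lists of record indices."""
    clusters = [[i] for i in range(len(records))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(partial_distance(records[p], records[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the two closest clusters
        del clusters[j]
    return clusters

# Hypothetical data: two groups, each containing one incomplete record.
records = [
    (1.0, 2.0), (1.2, None), (0.9, 2.1),
    (8.0, 9.0), (None, 9.2), (8.1, 8.8),
]
clusters = hierarchical_clusters(records, k=2)
```

With this data, each incomplete record lands in the same cluster as the complete records it resembles, which is the property the step relies on.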

Abstract

The invention provides a missing value interpolation method based on the fusion of multiple algorithms. The method comprises the following steps: S1, performing hierarchical clustering on all of the data; S2, for each class that contains missing values, judging whether each record has missing values, and dividing the records into a missing data group and a complete data group; S3, randomly dividing the data in the complete data group into a training set and a test set, and predicting the test set with n existing interpolation methods so as to construct a certain number of sample sets; S4, interpolating the classes with missing values by using a neural network model so as to obtain a final interpolation value; S6, judging whether classes with missing values still exist, and if so, executing step S2, otherwise executing step S7; and S7, ending the operation. Because the missing values are obtained by combining a plurality of existing methods, the method avoids the drawbacks of choosing a missing value interpolation method subjectively, and the missing values can be interpolated objectively and effectively.
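
Steps S2 to S4 can be sketched for a single cluster as follows. Everything concrete here is an assumption made for illustration: the three base methods (mean, nearest neighbour, least-squares line), the 14/7 split, the synthetic data, and above all the fusion model, where a single linear neuron trained by full-batch gradient descent stands in for the neural network, whose architecture the abstract does not specify.

```python
import random

# Three stand-ins for the "n existing interpolation methods" (n = 3 here).
def mean_method(train, x):
    return sum(y for _, y in train) / len(train)

def nearest_method(train, x):
    return min(train, key=lambda r: abs(r[0] - x))[1]

def linear_method(train, x):
    n = len(train)
    mx = sum(a for a, _ in train) / n
    my = sum(b for _, b in train) / n
    sxx = sum((a - mx) ** 2 for a, _ in train)
    slope = sum((a - mx) * (b - my) for a, b in train) / sxx
    return my + slope * (x - mx)

METHODS = [mean_method, nearest_method, linear_method]

def mse(weights, bias, samples):
    """Mean squared error of the fused prediction over the sample set."""
    return sum((bias + sum(w * p for w, p in zip(weights, preds)) - y) ** 2
               for preds, y in samples) / len(samples)

def train_fusion(samples, lr=0.1, epochs=2000):
    """Fit one linear neuron; it starts as a plain average of the base methods."""
    n = len(samples[0][0])
    weights, bias = [1.0 / n] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for preds, y in samples:
            err = bias + sum(w * p for w, p in zip(weights, preds)) - y
            for k in range(n):
                gw[k] += 2 * err * preds[k] / len(samples)
            gb += 2 * err / len(samples)
        weights = [w - lr * g for w, g in zip(weights, gw)]
        bias -= lr * gb
    return weights, bias

# Hypothetical cluster of (x, y) records; y = None marks a missing value.
cluster = [(i / 20, (i / 20) ** 2) for i in range(21)] + [(0.33, None), (0.87, None)]

# S2: split into a complete data group and a missing data group.
complete = [r for r in cluster if r[1] is not None]
missing = [r for r in cluster if r[1] is None]

# S3: random train/test split, then one sample per test record:
# (predictions of the n base methods, true value).
rng = random.Random(0)
rng.shuffle(complete)
train, test = complete[:14], complete[14:]
samples = [([m(train, x) for m in METHODS], y) for x, y in test]

# S4: learn the fusion weights, then impute the missing records.
# Assumption: base methods use all complete records at imputation time.
weights, bias = train_fusion(samples)
imputed = [(x, bias + sum(w * m(complete, x) for w, m in zip(weights, METHODS)))
           for x, _ in missing]
```

Starting the neuron at equal weights means the fused model begins as a simple average of the base methods and can only lower its error on the sample set from there, which mirrors the motivation of not having to pick one method by hand.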

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a missing value interpolation method based on the fusion of multiple algorithms.

Background

[0002] In many studies that require data collection, missing data is very common. The causes are varied but are mainly mechanical or human. Mechanical causes are failures of data collection or storage, such as data storage failure or memory damage. Human causes are subjective mistakes, historical limitations, or intentional concealment: for example, respondents in a questionnaire survey refuse to answer certain questions or give invalid answers, or data entry staff make mistakes and omit data. Before a survey begins, it is essential to plan carefully and to take care to avoid missing values in important data. However, for ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/18
CPC: G16Z99/00
Inventor: 陶波, 许飞月, 陈乐焱, 李青海
Owner: GUANGDONG KINGPOINT DATA SCI & TECH CO LTD