Multiple interpolation method for soil data set

A multiple interpolation technique applied in the fields of electronic digital data processing, digital data information retrieval, and special data processing applications. It addresses two problems: data filled in by single imputation lose accuracy, and the uncertainty of interpolation is not considered.

Pending Publication Date: 2022-06-07
GUILIN UNIVERSITY OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

Single imputation methods include mean substitution, regression interpolation, and K-nearest-neighbor interpolation. Single imputation does not account for the uncertainty introduced by the interpolation process.
Moreover, when the data are not missing completely at random, the values obtained by single imputation lose accuracy.
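
The loss of uncertainty under single imputation can be illustrated with a minimal sketch. The data and the helper name `mean_substitute` are hypothetical and do not come from the patent; the sketch only shows that filling gaps with the observed mean shrinks the spread of the column, so the filled data look more certain than they are.

```python
import statistics

def mean_substitute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]

# Hypothetical column with two missing readings.
observed_only = [2.0, 4.0, 6.0, 8.0]
with_gaps = [2.0, None, 4.0, 6.0, None, 8.0]

filled = mean_substitute(with_gaps)

# Population variance before and after filling. Every filled value sits
# exactly on the mean, so the variance of the filled column is smaller.
var_observed = statistics.pvariance(observed_only)
var_filled = statistics.pvariance(filled)
print(var_observed, var_filled)
```

Multiple interpolation counters this by producing several filled data sets whose differences reflect the uncertainty of the filled values.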

Examples

Embodiment Construction

[0041] Step 1: Apply the improved K-nearest-neighbor multiple interpolation method to the incomplete data matrix X to obtain the parameters k and m;

[0042] Step 2: Randomly select a missing value in data matrix X, i.e. x_is;

[0043] Step 3: Calculate the interpolation estimate of the missing value using the improved K-nearest-neighbor multiple interpolation method, as given by formula (4);

[0044] Step 4: Replace the missing value x_is with the interpolation estimate, updating matrix X to matrix X*;

[0045] Step 5: Randomly select the next missing value in matrix X* and repeat the process until all missing values of the original matrix X have been estimated;

[0046] Step 6: Repeat steps 2-5 to obtain M estimated data sets.
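
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: formula (4) is not reproduced in this text, so `knn_estimate` is a stand-in that averages the k nearest donor rows, and k and M are passed in directly because the text does not show how Step 1 derives them. All function names and the sample matrix are hypothetical.

```python
import math
import random

def knn_estimate(X, i, s, k):
    """Stand-in for formula (4), which is not shown in the source.

    Returns the mean of column s over the k rows nearest to row i.
    The distance uses only columns (other than s) filled in both rows.
    """
    candidates = []
    for r, row in enumerate(X):
        if r == i or row[s] is None:
            continue
        shared = [j for j in range(len(row))
                  if j != s and row[j] is not None and X[i][j] is not None]
        if not shared:
            continue
        dist = math.sqrt(
            sum((row[j] - X[i][j]) ** 2 for j in shared) / len(shared))
        candidates.append((dist, row[s]))
    if not candidates:
        raise ValueError(f"no donor rows for cell ({i}, {s})")
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)

def impute_once(X, k, rng):
    """Steps 2-5: fill every missing cell of a copy of X in random order."""
    X_star = [list(row) for row in X]
    missing = [(i, s) for i, row in enumerate(X)
               for s, v in enumerate(row) if v is None]
    rng.shuffle(missing)  # Steps 2 and 5: random visiting order
    for i, s in missing:
        # Steps 3-4: estimate from the current X*, then write it back,
        # so earlier estimates can serve as predictors for later ones.
        X_star[i][s] = knn_estimate(X_star, i, s, k)
    return X_star

def multiple_impute(X, k, M, seed=0):
    """Step 6: repeat the pass M times to obtain M estimated data sets."""
    rng = random.Random(seed)
    return [impute_once(X, k, rng) for _ in range(M)]

# Hypothetical 5 x 3 data matrix; None marks a missing value.
X = [
    [1.0, 2.0, 3.0],
    [1.1, None, 3.1],
    [5.0, 6.0, None],
    [5.2, 6.1, 7.0],
    [0.9, 2.1, 2.9],
]
results = multiple_impute(X, k=2, M=3, seed=1)
for data_set in results:
    print(data_set)
```

Because each estimate is written back into X* before the next cell is visited, the random visiting order can change later estimates. In this sketch that order is the only source of variation between the M data sets.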

Abstract

The invention provides a multiple interpolation method, based on improved K-nearest neighbors, for soil inorganic salt proportion data sets. Real databases currently contain large numbers of missing values, which seriously degrade the quality of information queries and distort the results of data mining and data analysis, misleading the people who rely on them to make decisions. The preferred remedy is to fill in the missing data in advance. Multiple interpolation has proved to be an effective strategy for handling missing data and for accounting for interpolation uncertainty, and with high-dimensional data, missing values cause even more serious problems. For this case, the invention provides a multiple interpolation method based on improved K-nearest neighbors in which the distance is calculated from the correlation information between the target and the candidate predictors. Because only the relevant predictors contribute to the distance, the method is also suitable for high-dimensional data with missing values.
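
One way to read the distance described in the abstract is sketched below. The exact weighting is not given in this text, so the choices here are assumptions: each candidate predictor column is weighted by the absolute Pearson correlation with the target column, and columns below a hypothetical `threshold` get weight zero so that they do not contribute. The function names and sample rows are illustrative only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if degenerate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def predictor_weights(rows, target, threshold=0.5):
    """Weight each column by |correlation| with the target column.

    Columns whose |correlation| falls below the (assumed) threshold,
    and the target column itself, receive weight 0.
    """
    weights = []
    for j in range(len(rows[0])):
        if j == target:
            weights.append(0.0)
            continue
        pairs = [(r[j], r[target]) for r in rows
                 if r[j] is not None and r[target] is not None]
        if len(pairs) < 2:
            weights.append(0.0)
            continue
        corr = abs(pearson([p[0] for p in pairs], [p[1] for p in pairs]))
        weights.append(corr if corr >= threshold else 0.0)
    return weights

def weighted_distance(a, b, weights):
    """Weighted root-mean-square difference over usable columns."""
    num = den = 0.0
    for x, y, w in zip(a, b, weights):
        if w > 0 and x is not None and y is not None:
            num += w * (x - y) ** 2
            den += w
    return math.sqrt(num / den) if den > 0 else math.inf

# Hypothetical rows: column 0 is the target, column 1 tracks it closely,
# column 2 is only weakly related to it.
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 4.1, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 7.9, 2.0],
    [5.0, 10.0, 3.0],
]
weights = predictor_weights(rows, target=0)
d = weighted_distance([1.0, 2.0, 5.0], [9.0, 4.0, 100.0], weights)
print(weights, d)
```

In this sketch the large gap in column 2 has no effect on `d`, because that column's weight is zero. This is the sense in which only relevant predictors contribute to the distance when many columns are present.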

Description

Technical field

[0001] The present invention relates to the field of filling big-data data sets, specifically to a multiple interpolation method based on improved K-nearest neighbors.

Background

[0002] With the development of the information age, big data has gradually penetrated various industries. Data loss arises from subjective and objective causes such as damaged storage equipment, violations during data entry, irregular data collection, and capacity limits of collection equipment, so databases contain missing values to varying degrees, which reduces the availability of the data. At the same time, most existing data analysis tools assume complete data sets and cannot directly handle incomplete data sets containing missing data. The traditional approach keeps only complete records for analysis and query. Directly discarding records with missing data is simple, but when the proportion of missing data is large, this method will cause distortion ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/215, G06K9/62
CPC: G06F16/215, G06F18/2413
Inventors: 程小辉, 张皓然
Owner: GUILIN UNIVERSITY OF TECHNOLOGY