A Multivariate Calibration Feature Wavelength Selection Method Based on Minimum Correlation Coefficient

The technology, based on correlation coefficients and characteristic wavelengths, is applied in the field of near-infrared spectrum wavelength selection. It addresses problems of existing methods, such as principles that are hard to understand, complex operation, and collinearity among variables, and it improves robustness and prediction accuracy while reducing costs.

Active Publication Date: 2021-09-03
HEILONGJIANG BAYI AGRICULTURAL UNIVERSITY

AI Technical Summary

Problems solved by technology

Among them, the successive projections algorithm (SPA) is a wavelength selection algorithm that minimizes collinearity among variables through vector projection analysis, but its principle is not easy to understand and its operation is relatively complicated. The other algorithms also suffer from collinearity among variables, and related research is ongoing.

Examples

Embodiment 1

[0038] The near-infrared spectroscopy data of the soil samples are public and come from the Quality&Technology website. The sample data contain two parts: the NIR spectra and the chemical properties of the samples. There are 108 samples in total. The spectra cover 400-2500 nm at a sampling interval of 2 nm, with 1050 wavelength points in total. The near-infrared spectra are shown in Figure 2. The present invention uses soil organic matter content as the dependent variable for wavelength selection and for modeling and prediction on the near-infrared spectral data, to demonstrate the effectiveness of the method.

[0039] Step 1: Divide the 108 samples into a modeling set (75%) and a validation set (25%); the modeling set contains 81 samples and the validation set contains 27 samples. To correct the spectral baseline, eliminate background interference, and improve the spectral resolution, the original spectral data is preproc...
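The 75%/25% division in Step 1 can be sketched as follows. The text does not say how samples are assigned to each set, so the random permutation, the seed, and the function name `split_modeling_validation` are assumptions made only for illustration.

```python
import numpy as np

def split_modeling_validation(n_samples, modeling_fraction=0.75, seed=0):
    """Return sorted index arrays for the modeling and validation sets.

    Assumption: samples are assigned at random; the source only gives
    the 75%/25% proportions, not the assignment rule.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_model = int(round(n_samples * modeling_fraction))
    return np.sort(order[:n_model]), np.sort(order[n_model:])

model_idx, valid_idx = split_modeling_validation(108)
print(len(model_idx), len(valid_idx))  # 81 27
```

The same call with 80 samples yields the 60/20 division used for the grain data in Embodiment 2.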

Embodiment 2

[0057] The second dataset is a public set of near-infrared spectral data of grain samples from the EigenVector website. The dataset includes 80 grain samples measured by three different near-infrared spectrometers. The spectra cover 1100-2498 nm at a sampling interval of 2 nm, with 700 wavelength points in total. The chemical properties include moisture, oil, protein and starch values. In this example, the near-infrared spectra measured by the instrument mp6 are selected, and the starch content of the grain is used as the dependent variable for wavelength selection, spectral data modeling, and prediction analysis, to illustrate the effectiveness of the method.
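As a quick check of the wavelength grid described above, the following sketch builds the axis for the grain spectra and maps a column index of the spectral matrix back to a wavelength. The helper name `wavelength_axis` is hypothetical, and the endpoint is assumed to be included.

```python
import numpy as np

def wavelength_axis(start_nm, stop_nm, step_nm):
    """Evenly spaced wavelength grid that includes both end points."""
    return np.arange(start_nm, stop_nm + step_nm, step_nm)

grain_axis = wavelength_axis(1100, 2498, 2)
print(grain_axis.size)   # 700
print(grain_axis[100])   # 1300, the wavelength of column index 100
```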

[0058] Step 1: Divide the 80 samples into a modeling set (75%) and a validation set (25%); the modeling set contains 60 samples and the validation set contains 20 samples. To correct the spectral baseline, eliminate background interference, and improve the spectral resolution, the ori...
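The abstract names Savitzky-Golay (S-G) first-order derivative processing as the preprocessing applied to the spectra. A minimal NumPy sketch of that filter is given here. The window length, polynomial order, and edge handling (repeating the end values) are assumptions, since the visible text does not state them, and the function names are hypothetical.

```python
import numpy as np

def sg_first_derivative_weights(window=5, polyorder=2, delta=1.0):
    """Weights that estimate the first derivative at the window centre."""
    if window % 2 == 0 or polyorder >= window:
        raise ValueError("window must be odd and larger than polyorder")
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns are 1, t, t^2, ...: a least-squares polynomial fit over the window
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 1 of the pseudo-inverse returns the fitted linear coefficient,
    # which is the derivative of the fitted polynomial at t = 0
    return np.linalg.pinv(design)[1] / delta

def sg_first_derivative(X, window=5, polyorder=2, delta=1.0):
    """Apply the filter along the wavelength axis (columns) of X."""
    weights = sg_first_derivative_weights(window, polyorder, delta)
    half = window // 2
    # Assumption: pad by repeating the edge values so the output keeps
    # the same number of wavelength points as the input
    padded = np.pad(X, ((0, 0), (half, half)), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window, axis=1)
    return windows @ weights

# Usage: a linear "spectrum" with slope 3 per nm, sampled every 2 nm
wl = np.arange(1100, 1140, 2)
X_demo = (3.0 * wl + 1.0)[None, :]
D_demo = sg_first_derivative(X_demo, window=5, polyorder=2, delta=2.0)
```

Away from the padded edges, `D_demo` recovers the slope of the linear test spectrum, which is a simple way to sanity-check the weights before applying the filter to real spectra.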

Abstract

The invention discloses a multivariate calibration characteristic wavelength selection method based on the minimum correlation coefficient, which aims to solve the problems of existing wavelength selection methods. The steps are as follows: apply Savitzky-Golay (S-G) first-order derivative processing to the spectral dataset X; calculate the absolute value of the correlation coefficient between its column vectors to obtain the correlation coefficient matrix R; for each column of R, calculate the mean and standard deviation of the elements other than the diagonal element; select a threshold pair for the mean and standard deviation of the correlation coefficients to form the candidate wavelength set S; sort the wavelengths in S to obtain the set S'; add one wavelength variable at a time to build an MLR model and calculate its RMSEV value; the variable subset corresponding to the minimum RMSEV gives the characteristic wavelengths for S. Then select the next threshold pair and repeat the above steps, and finally find the minimum RMSEV value over all characteristic wavelength sets together with its corresponding characteristic wavelengths. The method reduces variable redundancy as far as possible, is simple in principle, and is easy to implement.
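The steps in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure. The abstract does not give the exact rule that turns a threshold pair into S, nor the sort key for S'. Here a wavelength enters S when both its mean and its standard deviation of absolute correlation are at or below the thresholds, and S' is sorted by ascending mean correlation. The spectra are assumed to be already preprocessed, R is computed on the modeling set, and all function names are hypothetical.

```python
import numpy as np

def abs_corr_stats(X):
    """|R| between columns of X, plus per-column mean/std of off-diagonal entries."""
    R = np.abs(np.corrcoef(X, rowvar=False))
    p = R.shape[0]
    # R is symmetric, so row-wise off-diagonal entries equal column-wise ones
    off_diag = R[~np.eye(p, dtype=bool)].reshape(p, p - 1)
    return R, off_diag.mean(axis=1), off_diag.std(axis=1)

def fit_mlr(X, y):
    """Ordinary least squares with an intercept term."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def rmsev(coef, X, y):
    """Root mean square error on the validation set."""
    pred = coef[0] + X @ coef[1:]
    return float(np.sqrt(np.mean((y - pred) ** 2)))

def select_for_threshold(Xm, ym, Xv, yv, mean_r, std_r, t_mean, t_std):
    # Assumed rule: keep wavelengths at or below both thresholds
    S = np.flatnonzero((mean_r <= t_mean) & (std_r <= t_std))
    if S.size == 0:
        return np.inf, None
    # Assumed sort key: ascending mean absolute correlation
    S_sorted = S[np.argsort(mean_r[S], kind="stable")]
    # Keep the MLR model determined: fewer variables than modeling samples
    limit = min(S_sorted.size, len(ym) - 2)
    best_err, best_idx = np.inf, None
    for k in range(1, limit + 1):
        idx = S_sorted[:k]  # add one wavelength variable at a time
        err = rmsev(fit_mlr(Xm[:, idx], ym), Xv[:, idx], yv)
        if err < best_err:
            best_err, best_idx = err, idx.copy()
    return best_err, best_idx

def min_corr_selection(Xm, ym, Xv, yv, threshold_pairs):
    """Repeat the search for every threshold pair and keep the overall minimum."""
    _, mean_r, std_r = abs_corr_stats(Xm)
    best_err, best_idx = np.inf, None
    for t_mean, t_std in threshold_pairs:
        err, idx = select_for_threshold(Xm, ym, Xv, yv, mean_r, std_r, t_mean, t_std)
        if err < best_err:
            best_err, best_idx = err, idx
    return best_err, best_idx

# Usage on synthetic data (not the soil or grain datasets)
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(60, 20))
y_syn = 2.0 * X_syn[:, 3] - X_syn[:, 7] + 0.01 * rng.normal(size=60)
err_syn, idx_syn = min_corr_selection(
    X_syn[:45], y_syn[:45], X_syn[45:], y_syn[45:],
    threshold_pairs=[(0.12, 0.12), (1.0, 1.0)],
)
```

Tracking the best subset per threshold pair and then across pairs mirrors the two nested searches described in the abstract. The cap on the number of variables is a safeguard added for this sketch so that each least-squares fit remains determined.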

Description

Technical field

[0001] The invention relates to the field of near-infrared spectrum wavelength selection, in particular to a multivariate calibration characteristic wavelength selection method based on the minimum correlation coefficient.

Background

[0002] In recent years, near-infrared spectroscopy has been widely used in the petrochemical, pharmaceutical, environmental, clinical, agricultural, food and biomedical fields. The near-infrared spectral region (800-2500 nm) is mainly composed of overtone and combination absorption bands of hydrogen-containing groups. The absorption intensity is weak and the sensitivity is low, and the spectra suffer from deficiencies such as multicollinearity and too many non-informative variables. Selecting characteristic wavelengths from the full spectrum reduces data redundancy and multicollinearity, which can improve the prediction accuracy of the model and reduce its complexity.

[0003] Common variable select...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G01N21/359
CPC: G01N21/359, G06F18/213
Inventor: 陈争光
Owner: HEILONGJIANG BAYI AGRICULTURAL UNIVERSITY