
A method for determining a partial least squares regression latent variable number

A technique for partial least squares regression and latent variable selection, applied in the field of data analysis and processing, that addresses overfitting and the difficulty of determining the number of latent variables in a partial least squares regression model.

Pending Publication Date: 2019-05-21
CHINA TOBACCO GUIZHOU IND
Cites: 3; Cited by: 2

AI Technical Summary

Problems solved by technology

However, when this criterion is used to select the number of latent variables, the root mean square error of cross-validation often keeps decreasing (or the coefficient of determination keeps approaching 1), or the partial least squares regression models built with different numbers of latent variables differ little in predictive power (the differences between their coefficients of determination are small). In either case, the number of latent variables is difficult to determine.
If the number of latent variables is nevertheless chosen according to this criterion, too many latent variables are often selected, which leads to overfitting of the partial least squares regression model.
[0006] Therefore, how to avoid the overfitting caused by selecting too many latent variables when establishing a partial least squares regression model is a technical problem that those skilled in the art need to solve.
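
For reference, the two selection criteria named above are conventionally written as follows. These are the standard textbook definitions, assumed here rather than quoted from the patent: y_i is the measured value of sample i, \hat{y}_i is its prediction from the sub-model that did not see sample i, \bar{y} is the mean of the measured values, and n is the number of samples.

```latex
\mathrm{RMSECV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
```

Because RMSECV can keep shrinking slowly (and R^2 can keep creeping toward 1) as latent variables are added, neither curve necessarily shows a clear turning point, which is the difficulty described above.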



Examples


Embodiment Construction

[0041] The core of the present invention is to provide a method for determining the number of latent variables of partial least squares regression, which is used to avoid over-fitting caused by selecting too many latent variables when establishing a partial least squares regression model.

[0042] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0043] FIG. 1 is a flow chart of the first method for determining the number of latent variables of partial least squares regression provided by the embodiment of the ...



Abstract

The invention discloses a method for determining the number of latent variables of a partial least squares regression, which comprises the following steps: selecting a number of latent variables and, based on the collected samples, using cross-validation to select N sub-training sets and N sub-test sets in one-to-one correspondence with the N sub-training sets; establishing N sub-models from the N sub-training sets, and using each sub-model to predict the sub-test set corresponding to its sub-training set; recording the regression coefficients of each sub-model, and calculating from them the stability parameter corresponding to that number of latent variables; selecting another number of latent variables and repeating the steps from the establishment of the sub-models by cross-validation on the samples; and taking the number of latent variables at which the stability parameter is largest as the optimal number of latent variables. The curve of the stability parameter against the number of latent variables first rises and then falls, so that a worker can conveniently determine the optimal number of latent variables and establish a model with good stability.
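
The steps of the abstract can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not give the formula for the stability parameter, so the sketch assumes a hypothetical one (the average over variables of |mean| / standard deviation of each regression coefficient across the N sub-models). The function names, the use of K-fold splitting, the single-response NIPALS fit, and the synthetic data are likewise illustrative choices.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Fit single-response PLS by NIPALS; return coefficients for centered X."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # no covariance left to extract
            break
        w = w / norm                  # weight vector
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


def stability_by_latent_variables(X, y, candidates, n_folds=5, seed=0):
    """For each candidate count, build one sub-model per fold and score it.

    Returns (stability, rmsecv) dicts keyed by the number of latent variables.
    ASSUMPTION: stability = mean_j |mean_k b_jk| / std_k b_jk; the source
    does not state its formula.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    stability, rmsecv = {}, {}
    for a in candidates:
        coefs, sq_err = [], []
        for test in folds:
            train = np.setdiff1d(np.arange(len(y)), test)
            b = pls1_coefficients(X[train], y[train], a)
            # Predict the sub-test set with the matching sub-model.
            pred = y[train].mean() + (X[test] - X[train].mean(axis=0)) @ b
            sq_err.extend((y[test] - pred) ** 2)
            coefs.append(b)           # record the regression coefficients
        coefs = np.array(coefs)
        ratio = np.abs(coefs.mean(axis=0)) / (coefs.std(axis=0, ddof=1) + 1e-12)
        stability[a] = float(ratio.mean())
        rmsecv[a] = float(np.sqrt(np.mean(sq_err)))
    return stability, rmsecv


def select_latent_variables(stability):
    """Pick the number of latent variables with the largest stability."""
    return max(stability, key=stability.get)


if __name__ == "__main__":
    # Synthetic "large p, small n" data, purely for illustration.
    rng = np.random.default_rng(42)
    latent = rng.normal(size=(30, 3))
    X = latent @ rng.normal(size=(3, 60)) + 0.05 * rng.normal(size=(30, 60))
    y = latent @ np.array([1.0, -0.5, 0.25]) + 0.05 * rng.normal(size=30)
    stab, rmse = stability_by_latent_variables(X, y, range(1, 9))
    for a in stab:
        print(a, round(stab[a], 3), round(rmse[a], 4))
    print("selected:", select_latent_variables(stab))
```

Printing the stability next to RMSECV for each candidate makes it possible to compare the two criteria on the same data; whether the stability curve shows the rise-then-fall shape described in the abstract depends on the data and on the stability definition actually used.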

Description

Technical field

[0001] The invention relates to the field of data analysis and processing, in particular to a method for determining the number of latent variables of partial least squares regression.

Background technique

[0002] Data analysis refers to the process of analyzing a large amount of collected data with appropriate statistical methods, extracting useful information and forming conclusions in order to study and summarize the data in detail. In practical terms, data analysis helps people make judgments so that appropriate actions can be taken.

[0003] In some specific fields, the chemical data handled in modern analytical chemistry is often high-dimensional, with a small number of samples but a large number of variables. Such data poses the "large p, small n" problem, which is very challenging for statistics.

[0004] Partial Least Squares Regression (PLSR) is a common method that can deal with problems with more variables than sa...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/18
Inventor: 张辞海, 彭黔荣, 胡芸, 刘娜
Owner: CHINA TOBACCO GUIZHOU IND