Active learning method based on covariance matrix

A covariance matrix and active learning technology, applied in the field of machine learning, address problems such as the difficulty of obtaining complete sample information, the waste of data resources, and the difficulty of achieving ideal model prediction accuracy and generalization ability, and achieve high-quality results.

Active Publication Date: 2020-10-30
JIANGNAN UNIV

AI Technical Summary

Problems solved by technology

[0002] Traditional machine learning methods rely on a large number of samples with complete information to build models. In practice, human or environmental factors make complete sample information difficult to obtain, and most of the sample information is missing. In this case, how to use a small number of labeled samples together with a large number of unlabeled samples to improve model performance has become a key issue in machine learning research. If only a small number of labeled samples are used to train the model, its prediction accuracy and generalization ability are unlikely to reach ideal levels; in addition, ignoring a large number of unlabeled samples wastes data resources. It is therefore necessary to label unlabeled samples. Commonly used approaches include semi-supervised learning and active learning. Semi-supervised learning builds a model from the inputs and outputs of the labeled samples and, on that basis, extracts useful information from the unlabeled samples to improve regression accuracy. However, although traditional semi-supervised learning methods improve model performance, they may increase the computational load, and the improvement in model accuracy depends to a large extent on the structure of the semi-supervised model.


Examples


Embodiment 1

[0046] Referring to Figure 1, which shows a schematic diagram of the overall structure of the active learning method based on the covariance matrix, the method comprises: collecting butane concentration values in the debutanizer production process and using them as soft sensor modeling samples; dividing the soft sensor modeling samples into a training set and a test set, with the training set further divided into a labeled sample set and an unlabeled sample set; establishing a Gaussian process regression model using the labeled sample set and determining the initial parameters of the model; selecting, from the unlabeled sample set, the samples that maximize the determinant of the covariance matrix; re-establishing the Gaussian process regression model and determining the model parameters; and using the test set to predict the butane concentration.
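As a minimal sketch of the selection step described above, the following Python code greedily picks a batch from an unlabeled pool so that the covariance (kernel) matrix of the picked samples has the largest log-determinant. The squared-exponential kernel, the greedy search, the jitter term `noise_var`, the fixed hyperparameters, and all function names are assumptions made for illustration; this excerpt does not state the exact kernel or search procedure used by the invention.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and rows of B (assumed kernel)."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def select_max_determinant(X_pool, batch_size, noise_var=1e-6):
    """Greedily choose pool indices whose covariance matrix has the largest log-determinant."""
    selected = []
    for _ in range(batch_size):
        best_idx, best_logdet = None, -np.inf
        for i in range(len(X_pool)):
            if i in selected:
                continue
            idx = selected + [i]
            # Covariance matrix of the candidate batch, with jitter for numerical stability.
            K = rbf_kernel(X_pool[idx], X_pool[idx]) + noise_var * np.eye(len(idx))
            sign, logdet = np.linalg.slogdet(K)
            if sign > 0 and logdet > best_logdet:
                best_idx, best_logdet = i, logdet
        selected.append(best_idx)
    return selected

# Hypothetical one-dimensional pool: near-duplicate points add little to the determinant.
X_pool = np.array([[0.0], [0.1], [1.0], [2.0], [2.1]])
print(select_max_determinant(X_pool, batch_size=3))
```

Because the determinant shrinks when two selected samples are strongly correlated, this criterion tends to favor samples that are spread out over the input space rather than clustered together.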

[0047] The parameters of the Gaussian process regressi...

Embodiment 2

[0085] In this embodiment, the active learning modeling method based on the covariance matrix proposed by the present invention is compared with the traditional Gaussian process regression active learning algorithm, which selects samples according to their prediction variance.
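For orientation, the variance-based baseline mentioned above can be sketched as follows: a Gaussian process is conditioned on the labeled inputs, and the unlabeled samples with the largest posterior predictive variance are chosen. The kernel, the fixed hyperparameters, the noise level, and the function names are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between rows of A and rows of B (assumed kernel)."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def predictive_variance(X_train, X_pool, noise_var=1e-4, signal_var=1.0):
    """Posterior variance of a zero-mean GP at each pool point, given the labeled inputs."""
    K = rbf_kernel(X_train, X_train, signal_var=signal_var) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_pool, signal_var=signal_var)
    L = np.linalg.cholesky(K)
    v = np.linalg.solve(L, K_s)  # v = L^{-1} K_s
    # Prior variance minus the part explained by the labeled inputs.
    return signal_var - np.sum(v**2, axis=0)

def select_by_variance(X_train, X_pool, batch_size):
    """Return indices of the pool points with the largest predictive variance."""
    var = predictive_variance(X_train, X_pool)
    return np.argsort(var)[::-1][:batch_size]

# Hypothetical data: one labeled input at 0, three unlabeled candidates.
X_train = np.array([[0.0]])
X_pool = np.array([[0.0], [1.0], [3.0]])
print(select_by_variance(X_train, X_pool, batch_size=2))
```

Note that this criterion scores each candidate independently, so several selected samples may lie close to one another; a determinant-based criterion scores the batch jointly.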

[0086] To test how the two selection strategies differ in their choice of unlabeled samples, a regression analysis is performed on the function z = sin(2x) + cos(4y), where x and y both follow a normal distribution; the data set is divided into 20 groups of labeled samples, 400 groups of unlabeled samples, and 400 groups of test samples.
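The numerical test set described above could be generated along the following lines. The source only says that x and y are normally distributed, so the standard normal distribution (mean 0, standard deviation 1), the random seed, and the function name `make_dataset` are assumptions made for this sketch.

```python
import numpy as np

def make_dataset(n_labeled=20, n_unlabeled=400, n_test=400, seed=0):
    """Sample (x, y) pairs and evaluate z = sin(2x) + cos(4y), then split into three sets."""
    rng = np.random.default_rng(seed)
    n_total = n_labeled + n_unlabeled + n_test
    # Assumption: x and y are drawn from a standard normal distribution.
    X = rng.standard_normal((n_total, 2))
    z = np.sin(2.0 * X[:, 0]) + np.cos(4.0 * X[:, 1])
    a, b = n_labeled, n_labeled + n_unlabeled
    labeled = (X[:a], z[:a])
    unlabeled = (X[a:b], z[a:b])  # outputs are kept so selected samples can be "labeled" later
    test = (X[b:], z[b:])
    return labeled, unlabeled, test

labeled, unlabeled, test = make_dataset()
print(labeled[0].shape, unlabeled[0].shape, test[0].shape)
```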

[0087] Specific process: 5 and 10 unlabeled samples, respectively, are selected for labeling in each iteration; the selection results for the third iteration are shown in Figure 4.

[0088] In Figure 4, the yellow points are the distribution points of the unlabeled sample set, the green points are the unlabeled samples selected based on the variance selection s...

Embodiment 3

[0091] In order to monitor the refining quality, it is necessary to monitor the butane concentration at the bottom of the column in real time; however, it is difficult to directly detect the outflow of the bottom material of the debutanizer, and a soft sensor model needs to be established.

[0092] During the real-time sampling process of the butane concentration in the debutanizer at the No. 81 natural gas processing station of the No. [...] contains only 7 process variables; 20 unlabeled samples are selected each time and added to the labeled sample set, and another 500 groups of samples are selected as test samples.

[0093] In order to verify the effectiveness of the unlabeled sample selection strategy proposed by the present invention, it is compared with the variance-based selection and random sample selection strategies.

[0094] The random sample selection is to randomly select 20 samples from the unlabeled sample set each time, and add the labeled sample set after marking to...
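The random baseline described above amounts to drawing 20 distinct indices from the unlabeled pool in each round. A minimal sketch follows; the function name `select_random`, the pool size of 400, and the seed are hypothetical.

```python
import numpy as np

def select_random(n_unlabeled, batch_size=20, seed=None):
    """Draw batch_size distinct indices uniformly at random from the unlabeled pool."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_unlabeled, size=batch_size, replace=False)

# Hypothetical pool of 400 unlabeled samples.
picked = select_random(400, batch_size=20, seed=0)
print(sorted(picked.tolist()))
```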


Abstract

The invention discloses an active learning method based on a covariance matrix. The method comprises the steps of: collecting butane concentration values in a debutanizer production process and taking them as soft measurement modeling samples; dividing the soft measurement modeling samples into a training set and a test set, wherein the training set is divided into a labeled sample set and an unlabeled sample set; establishing a Gaussian process regression model using the labeled sample set and determining the initial parameters of the model; selecting, from the unlabeled sample set, the samples that give the maximum determinant value of the covariance matrix; re-establishing the Gaussian process regression model and determining the model parameters; and predicting the butane concentration using the test set. The effectiveness of the covariance matrix selection strategy is verified through numerical simulation analysis and an application simulation of the debutanizer process.

Description

Technical field

[0001] The invention relates to the field of machine learning and is applied to the soft measurement of butane concentration in a debutanizer, in particular to an active learning method based on a covariance matrix.

Background

[0002] Traditional machine learning methods rely on a large number of samples with complete information to build models. In practice, human or environmental factors make complete sample information difficult to obtain, and most of the sample information is missing. In this case, how to use a small number of labeled samples together with a large number of unlabeled samples to improve model performance has become a key issue in machine learning research. If only a small number of labeled samples are used to train the model, its prediction accuracy and generalization ability are unlikely to reach ideal levels; in addition, ignoring a large number of unlabeled samples is a waste of data...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N20/00, G06F17/16
CPC: G06N20/00, G06F17/16
Inventors: 熊伟丽, 周博文, 马君霞
Owner: JIANGNAN UNIV