A triple-optimal semi-supervised regression algorithm based on a self-training framework

A regression algorithm using self-training technology, applied in computing, computer parts, instruments, etc., that addresses the scarcity and high acquisition cost of labeled samples

Status: Inactive
Publication Date: 2019-03-29
JIANGNAN UNIV
0 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0005] Labeled samples in industrial processes are scarce and costly to acquire, and traditional semi-supervised learning cannot guarantee sufficient and accurate prediction of unlabeled samples. To address these problems, the present invention proposes a three-optimized semi-supervised regression algorithm under the self-training framework.

Embodiment Construction

[0023] The present invention is described in further detail below with reference to Figure 1:

[0024] A three-optimized semi-supervised regression algorithm under a self-training framework, comprising the following steps:

[0025] Step 1: Screen the unlabeled samples and the labeled samples, use the screened labeled samples to build a Gaussian process regression model f_1, and use the model to predict the label values of the unlabeled sample set M_1, yielding the pseudo-labeled sample set S_1;

[0026] Unlabeled sample screening: given a threshold θ_1, use the Mahalanobis distance to measure the similarity d_i between the unlabeled sample x′_i and the center C of the dense region of labeled samples; if the distance from x′_i to C is less than θ_1, then x′_i satisfies the selection condition;

[0027] Labeled sample screening: given a threshold θ_2, use the Mahalanobis distance to measure the similarity d(x_i, x_j), and tally, for sample x_i and its surrounding samples x_j, the Mahalanobis dist...
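
The unlabeled-sample screening in [0026] can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: the excerpt does not say how the center C or the covariance matrix is obtained, so the sketch assumes C is the mean of the labeled samples and that the covariance is estimated from the labeled samples. The function names `mahalanobis_to_center` and `screen_unlabeled` and the value of `theta_1` are hypothetical.

```python
import numpy as np

def mahalanobis_to_center(X, center, cov):
    """Mahalanobis distance from each row of X to `center` under covariance `cov`."""
    # The pseudo-inverse tolerates a near-singular covariance estimate.
    cov_inv = np.linalg.pinv(np.atleast_2d(cov))
    diff = X - center
    # d_i = sqrt((x_i - C)^T Sigma^{-1} (x_i - C)), evaluated for every row at once.
    sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))

def screen_unlabeled(X_unlabeled, X_labeled, theta_1):
    """Keep the unlabeled samples whose distance to the center C is below theta_1."""
    # ASSUMPTION: C is the mean of the labeled samples and the covariance is
    # estimated from the labeled samples; the source excerpt does not specify either.
    center = X_labeled.mean(axis=0)
    cov = np.cov(X_labeled, rowvar=False)
    d = mahalanobis_to_center(X_unlabeled, center, cov)
    return X_unlabeled[d < theta_1], d

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(30, 2))
# The last unlabeled row is deliberately placed far from the labeled cloud.
X_unlabeled = np.vstack([rng.normal(size=(50, 2)), [[8.0, 8.0]]])
selected, distances = screen_unlabeled(X_unlabeled, X_labeled, theta_1=2.5)
```

Under these assumptions the far-away sample is rejected, while samples close to the labeled cloud pass the threshold and remain candidates for pseudo-labeling.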

Abstract

The invention relates to a triple optimization semi-supervised regression algorithm under a self-training framework, in the technical field of semi-supervised regression algorithms. The invention screens the unlabeled samples and the labeled samples, builds a Gaussian process regression model, and uses the model to predict the label values of the unlabeled sample set, yielding a pseudo-labeled sample set. A confidence criterion selects the pseudo-labeled samples that satisfy the condition, and a confidence judgment then further selects the pseudo-labeled samples with high credibility. At the same time, the labeled and unlabeled sample sets are updated; the unlabeled and labeled sample sets are re-screened and the thresholds updated; the Gaussian process regression model is used to predict pseudo-labels for the unlabeled sample set; the confidence of the pseudo-labeled sample set is judged; and the self-training cycle repeats until the set number of cycles is reached. The invention thus provides a confidence judgment for pseudo-labeled samples and introduces a self-training framework to raise the utilization of unlabeled samples, improving the prediction performance of the model once unlabeled samples are used.
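
The self-training cycle described in the abstract can be sketched as below. This is a simplified illustration under stated assumptions, not the patented algorithm: the excerpt does not define the confidence criterion, so the sketch uses the Gaussian process predictive variance as a stand-in, and it omits the per-cycle re-screening and threshold updates. The RBF kernel, the `noise` level, the `max_var` cutoff, and all function names are hypothetical choices.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gpr_predict(X_train, y_train, X_test, noise=1e-2):
    """Gaussian process regression posterior mean and variance (zero prior mean)."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)  # k(x, x) = 1 for this kernel
    return mean, np.maximum(var, 0.0)

def self_train(X_lab, y_lab, X_unlab, max_var=0.05, n_cycles=3):
    """Move confidently predicted unlabeled samples into the labeled set each cycle."""
    for _ in range(n_cycles):
        if len(X_unlab) == 0:
            break
        mean, var = gpr_predict(X_lab, y_lab, X_unlab)
        # ASSUMPTION: low predictive variance stands in for the confidence criterion.
        confident = var < max_var
        if not confident.any():
            break
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, mean[confident]])  # pseudo-labels
        X_unlab = X_unlab[~confident]
    return X_lab, y_lab, X_unlab

# Synthetic one-dimensional data for illustration only.
rng = np.random.default_rng(0)
X_lab = rng.uniform(-3.0, 3.0, size=(15, 1))
y_lab = np.sin(X_lab[:, 0])
X_unlab = rng.uniform(-3.0, 3.0, size=(40, 1))
X_aug, y_aug, X_rest = self_train(X_lab, y_lab, X_unlab)
```

Each cycle refits the model on the enlarged labeled set, so pseudo-labels accepted early influence later predictions. That is why the confidence judgment matters: an inaccurate pseudo-label would otherwise propagate through the remaining cycles.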

Description

Technical field

[0001] The invention relates to the technical field of semi-supervised regression algorithms, in particular to the technical field of three-optimized semi-supervised regression algorithms under the self-training framework.

Background

[0002] Some important quality variables in industrial processes such as chemical engineering, metallurgy, and fermentation cannot be measured by online instruments, and offline laboratory analysis lags seriously. It is therefore necessary to predict these important quality variables from sample data that can be measured directly. With the development of science and technology, especially industrial big data technology, unlabeled samples are increasingly easy to obtain in large quantities, while the cost of obtaining labeled samples remains high, so some industrial processes have few labeled samples. It is difficult for modeling methods to guarantee the predictio...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/2155
Inventors: 熊伟丽 (Xiong Weili), 程康明 (Cheng Kangming), 马君霞 (Ma Junxia)
Owner: JIANGNAN UNIV