Semi-supervised classification prediction method

A semi-supervised classification prediction technique, applied in special data processing applications, instruments, network data indexing, etc. It addresses problems such as overfitting and the model's inability to reach its best results, with the effects of enhanced accuracy, reduced risk, and a reduced degree of imbalance.

Active Publication Date: 2019-06-25
SOUTHWEST JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

[0014] From the above methods, it can be seen that the ensemble method can improve the final result under certain conditions; however, traditional semi-supervised methods cannot guarantee the best results for the model, and may even introduce a risk of overfitting.


Examples


Embodiment Construction

[0037] The present invention is described in further detail below with reference to Figure 1 and the embodiments.

[0038] A semi-supervised classification prediction method that can enhance model performance on classification tasks. It focuses on the redundancy within each category and discretizes (thins out) the data in local regions. Building on the optimization of category correlation, it also takes into account a discriminant criterion term for combined-view features. As shown in Figure 1, the implementation steps are as follows:

[0039] Step 1. Construct labeled data and unlabeled data:

[0040] (1) Use crawler technology to obtain data from the Internet, or use existing data sets; each sample in these data sets includes specific attribute features;

[0041] (2) Across the entire data set, the category of each sample is uniquely represented by its label; a sample that carries a label is called labeled data, and the sample represented by no labe...
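
Step 1 can be illustrated with a minimal sketch. It assumes the data set is already loaded as a NumPy feature matrix, and that samples without a label are marked with the sentinel value -1. The source does not specify how missing labels are encoded, so the sentinel, the function name `split_labeled_unlabeled`, and the toy data are all hypothetical.

```python
import numpy as np

UNLABELED = -1  # assumed sentinel for "no label"; not specified in the source


def split_labeled_unlabeled(X, y):
    """Split a data set into labeled samples (with labels) and unlabeled samples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labeled_mask = y != UNLABELED
    return X[labeled_mask], y[labeled_mask], X[~labeled_mask]


# Toy data: five samples with two attribute features each; three have no label.
X = np.array([[0.1, 1.0], [0.2, 0.9], [5.0, 5.1], [4.9, 5.3], [2.5, 3.0]])
y = np.array([0, UNLABELED, 1, UNLABELED, UNLABELED])

X_lab, y_lab, X_unl = split_labeled_unlabeled(X, y)
```

Here `X_lab` and `y_lab` hold the labeled data and `X_unl` holds the unlabeled data that a semi-supervised method would later draw on.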


Abstract

The invention discloses a method for eliminating category region redundancy based on a semi-supervised algorithm, and belongs to the field of data mining. The method targets classification prediction tasks: it reduces redundancy in the local region of each category in order to optimize the objective and lower the risk of overfitting, so that a model with high overall discriminative ability is obtained. The method proceeds in two stages: first, starting from each category, it finds the center of the category and selects a local region around that center; second, it considers the redundancy of that local region and reduces the redundancy of the data by random sampling. The method can be applied to sample redundancy problems in classification tasks such as disease diagnosis, text classification, face recognition and speech recognition, and can markedly improve classification accuracy.
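
The two stages described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the source does not say how the center or the local region is defined, so the sketch assumes the class mean as the center, a Euclidean ball of a given `radius` as the local region, and a `keep_fraction` that controls how many local samples survive random sampling. The function name and all parameters are hypothetical.

```python
import numpy as np


def reduce_local_redundancy(X, y, radius, keep_fraction, seed=0):
    """Randomly thin out the samples that lie close to each class center.

    Assumptions (not from the source): center = class mean, local region =
    Euclidean ball of `radius`, and `keep_fraction` of local samples are kept.
    Samples outside the local region are always kept.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        # Stage 1: find the class center and the local region around it.
        center = X[idx].mean(axis=0)
        dist = np.linalg.norm(X[idx] - center, axis=1)
        local = idx[dist <= radius]
        # Stage 2: reduce redundancy in the local region by random sampling.
        n_keep = int(np.ceil(keep_fraction * len(local)))
        if len(local) > n_keep:
            kept = rng.choice(local, size=n_keep, replace=False)
            keep[local] = False
            keep[kept] = True
    return X[keep], y[keep]


# Toy data: per class, eight samples packed around the center plus two far away.
base = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1],
                 [0.2, 0.0], [-0.2, 0.0], [0.0, 0.2], [0.0, -0.2],
                 [3.0, 0.0], [-3.0, 0.0]])
X_demo = np.vstack([base, base + 10.0])
y_demo = np.array([0] * 10 + [1] * 10)

X_red, y_red = reduce_local_redundancy(X_demo, y_demo, radius=1.0, keep_fraction=0.5)
```

In this toy run, half of the eight densely packed samples in each class are dropped, while the two distant samples per class are untouched. The sketch operates on labeled samples only; how the method involves unlabeled data is not recoverable from the abstract.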

Description

Technical field

[0001] The invention belongs to the technical field of data mining.

Background art

[0002] With the rapid development of information technology, scientific research and production practices based on big data have become common. However, as the amount of data increases, traditional supervised learning requires a large number of labeled samples, and labeling a large amount of data wastes human resources. Therefore, semi-supervised methods have been proposed.

[0003] Semi-supervised learning obtains valuable information from both labeled and unlabeled data at the same time, and addresses the problem of labeling large amounts of data. Among these approaches, the semi-supervised ensemble method is currently the mainstream method, and has been widely used in research fields such as disease diagnosis, text classification, face recognition, speech recognition, and web page classification. However, in some cases, the prediction results of the majority ...


Application Information

IPC(8): G06F16/2458, G06F16/951, G06K9/62
Inventor: 杨燕, 汪衡
Owner: SOUTHWEST JIAOTONG UNIV