Fast structural SVM text classification optimization algorithm

A support vector machine and text classification technology, applied in the field of fast structured support vector machine text classification optimization. It addresses problems such as minority classes being ignored, overly complex models, and poor performance, and improves classification accuracy and overall classification performance.

Status: Inactive; Publication Date: 2017-03-22
SUN YAT SEN UNIV +2
4 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

[0007] Third, although the F value and AUC are good evaluation metrics, one can also consider training a text classification model by directly optimizing the F value or AUC. Such approaches resemble Ranking SVM and related methods to some extent: they aim to reduce training error, but they perform poorly when the training samples are imbalanced, and the classifier distinguishes poorly between the training instances generated from minority-class samples, or even ignores them entirely.
However, not all ensemble methods are effective. Even when the F value and AUC can be improved, combining multiple classifiers leads to extremely complex models, high time complexity, and poor interpretability.
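
To make the imbalance problem concrete, the following minimal sketch compares accuracy, the F value (F1), and AUC for a classifier that ignores the minority class. The data set (5 positives, 95 negatives) and the helper names `accuracy`, `f1_score`, and `auc` are illustrative assumptions, not part of the patent.

```python
# Illustrative sketch: why overall accuracy is misleading on imbalanced data.
# All names and the toy data are assumptions made for this example.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 5 minority-class (positive) samples and 95 majority-class (negative) samples.
y_true = [1] * 5 + [0] * 95
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100
scores = [0.0] * 100  # constant score for every sample

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

The degenerate classifier scores high on accuracy while its F1 and AUC show that it carries no information about the minority class, which is the motivation for optimizing those metrics directly.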



Examples


Embodiment 1

[0054] As shown in Figure 1, a fast structured support vector machine text classification optimization algorithm includes the following steps:

[0055] S1: Extend the objective function of the structured support vector machine to obtain the dual form of the structured support vector machine;

[0056] S2: Use the obtained dual form to optimize the objective function of the structured support vector machine through the coordinate ascent method.
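
The excerpt does not give the structured SVM dual itself, so the following is only a sketch of the general idea in S2 on a simpler problem: coordinate ascent on the dual of a standard binary linear SVM (hinge loss, box constraint 0 <= alpha_i <= C). The function name `dual_coordinate_ascent`, the toy data, and the parameter values are assumptions for illustration, not the patent's algorithm.

```python
# Sketch: coordinate ascent on the dual of a binary linear SVM.
# Dual objective: D(alpha) = sum_i alpha_i - 0.5 * ||sum_i alpha_i y_i x_i||^2,
# subject to 0 <= alpha_i <= C. This is NOT the structured SVM dual of the patent.
import random

def dual_coordinate_ascent(X, y, C=1.0, epochs=50, seed=0):
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n            # dual variables, one per training sample
    w = [0.0] * d                # primal weights, kept equal to sum_i alpha_i y_i x_i
    q = [sum(v * v for v in x) for x in X]  # diagonal of the Gram matrix
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            if q[i] == 0.0:
                continue         # an all-zero sample cannot change w
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            grad = 1.0 - margin  # partial derivative of D with respect to alpha_i
            # Exact maximizer along coordinate i, clipped to the box [0, C].
            new_alpha = min(max(alpha[i] + grad / q[i], 0.0), C)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                for j in range(d):
                    w[j] += delta * y[i] * X[i][j]
                alpha[i] = new_alpha
    return w, alpha

# Toy linearly separable data; the trailing 1.0 acts as a bias feature.
X = [[2.0, 2.0, 1.0], [1.0, 3.0, 1.0], [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]]
y = [1, 1, -1, -1]
w, alpha = dual_coordinate_ascent(X, y, C=1.0)
margins = [yi * sum(wj * xj for wj, xj in zip(w, xi)) for xi, yi in zip(X, y)]
print("w:", w)
print("margins:", margins)
```

Each step solves a one-variable problem in closed form and updates `w` incrementally, which is why coordinate methods on the dual tend to be cheap per iteration.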

[0057] In text classification, training an SVM (Support Vector Machine) is the process of finding the maximum-margin hyperplane in the data set, which is analyzed for the linearly separable case. For the linearly inseparable case, a nonlinear mapping converts the linearly inseparable samples of the low-dimensional input space into a high-dimensional feature space in which they become linearly separable, so that linear algorithms can be used in the high-dimensional feature space to linearly analyze the nonlinear features of sampl...
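
As a small illustration of the nonlinear mapping idea described above, the sketch below uses an XOR-style data set, which no straight line in two dimensions separates, and maps it into three dimensions where a single hyperplane does. The feature map `phi`, the weight vector `w`, and the data are hypothetical choices made for this example.

```python
# Sketch: an explicit nonlinear feature map that makes XOR-style data
# linearly separable. The map and the data are illustrative assumptions.

def phi(x):
    """Map a 2-D point to 3-D by appending the product of its coordinates."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

def predict(w, x):
    """Linear classifier applied in the mapped feature space."""
    score = sum(wj * fj for wj, fj in zip(w, phi(x)))
    return 1 if score > 0 else -1

# Points with equal-sign coordinates are labeled +1, the others -1.
points = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
labels = [1, -1, -1, 1]

# In the mapped space the third coordinate alone separates the classes.
w = (0.0, 0.0, 1.0)
predictions = [predict(w, p) for p in points]
print(predictions)
```

Kernel SVMs achieve the same effect without computing the mapped features explicitly, by evaluating inner products in the feature space through a kernel function.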


Abstract

The invention provides a fast structural SVM text classification optimization algorithm. For text classification tasks on imbalanced data sets, the algorithm directly optimizes a performance evaluation metric of the main category, such as precision, recall, or AUC. Unlike most conventional text classification algorithms, the method does not learn a single rule that predicts the label of a single sample. Instead, it formalizes the learning problem as a multivariate prediction problem over all samples in the data set, in contrast to the conventional goal of reducing the overall classification error rate. As a result, classification accuracy on imbalanced text data sets improves, and classification performance is effectively enhanced. Drawing on a sparse approximation algorithm based on the Structural SVM, which has better time complexity, the method can optimize evaluation metrics such as the F value (computed from precision and recall) and AUC, reducing time complexity while obtaining better results.
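
The multivariate formulation described in the abstract can be sketched as follows. This assumes a formulation in the style of multivariate structural SVMs, where the whole label vector is predicted jointly, the joint score is the sum of y_i * (w . x_i), and the loss is 1 - F1 over the full data set. The excerpt does not state the patent's exact objective, so the functions `f1_loss`, `joint_score`, and `structured_hinge` are hypothetical, and the brute-force search over all label vectors is only feasible for tiny examples.

```python
# Sketch of a multivariate (whole-data-set) structured hinge loss with an
# F1-based loss. Assumed formulation for illustration only; brute force
# over all 2^n label vectors, so use it only on very small n.
from itertools import product

def f1_loss(y_true, y_pred):
    """1 - F1 for the positive class (+1), computed over the whole label vector."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    if 2 * tp + fp + fn == 0:
        return 0.0  # no positives anywhere: treat as zero loss
    return 1.0 - 2 * tp / (2 * tp + fp + fn)

def joint_score(w, X, y_bar):
    """Score of a complete labeling: sum_i y_i * (w . x_i)."""
    return sum(yi * sum(wj * xj for wj, xj in zip(w, x)) for x, yi in zip(X, y_bar))

def structured_hinge(w, X, y_true):
    """max over labelings of [loss + score(labeling)] minus score(true labeling)."""
    best_labeling, best_value = None, float("-inf")
    for y_bar in product((-1, 1), repeat=len(X)):
        value = f1_loss(y_true, y_bar) + joint_score(w, X, y_bar)
        if value > best_value:
            best_labeling, best_value = y_bar, value
    return best_value - joint_score(w, X, y_true), best_labeling

X = [[2.0], [1.5], [-1.0], [-2.0]]
y_true = (1, 1, -1, -1)

hinge_untrained, worst = structured_hinge([0.0], X, y_true)
hinge_trained, _ = structured_hinge([1.0], X, y_true)
print("hinge with w = 0:", hinge_untrained, "most violated labeling:", worst)
print("hinge with w = 1:", hinge_trained)
```

Because the loss is defined on the entire label vector rather than per sample, a metric such as F1 that does not decompose over individual examples can enter the training objective directly.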

Description

Technical field

[0001] The invention relates to the field of text classification, and more specifically, relates to a fast structured support vector machine text classification optimization algorithm.

Background technique

[0002] With the rapid development of Internet technology, the classification, organization and management of massive data has become a topic of great research significance. Among these data, text data is the largest category, including news reports, user comments, emails, advertisements, and so on. These communication carriers containing a large amount of text content require a complete classification mechanism to manage and filter information, so text classification came into being. Text classification refers to the classification of texts into pre-defined categories by computer according to some automatic classification algorithm based on the content of the text, which is an important topic in the fields of information storage and information retrieval...


Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/355, G06F18/2411
Inventor: 郭泽颖, 柯戈扬, 印鉴
Owner: SUN YAT SEN UNIV