A Classification Method for Identification of Public Opinion Tendency Aiming at Unbalanced Category Distribution

A classification method for public opinion tendency recognition, applicable to text database clustering/classification, character and pattern recognition, and unstructured text data retrieval. It addresses the deviation between recognized and actual tendencies that arises from class-imbalanced training data, the timeliness of text release, and the timeliness of public opinion, problems for which no effective solution had been proposed, and thereby improves classification accuracy and recognition quality.

Active Publication Date: 2021-04-16
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0003] When general-purpose machine learning algorithms are used to analyze public opinion tendencies, class imbalance in the training data, the timeliness of text release, and the timeliness of public opinion often cause a large deviation between the recognized tendency and the actual tendency. No effective solution has been proposed so far.



Examples


Embodiment Construction

[0033] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. The embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.

[0034] Referring to Figure 1, the present invention provides a method for identifying public opinion tendencies when training sample categories are unevenly distributed, comprising the following steps:

[0035] Step 1: Manually track and label current public opinion hot spots, select high-frequency words related to the field of public opinion of interest as public opinion hot words, build a thesaurus of these high-frequency words, and update it daily;

[0036] In this embodiment, the source of hot words in publ...
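Step 1 can be illustrated with a minimal sketch. It assumes the manually collected posts have already been segmented into tokens and that a hand-curated set of field-related terms is available; the function names `build_hot_word_thesaurus` and `update_thesaurus`, the `top_k` parameter, and the date-keyed dictionary are hypothetical choices for illustration and are not taken from the patent.

```python
from collections import Counter
from datetime import date


def build_hot_word_thesaurus(tokenized_posts, domain_terms, top_k=5):
    """Return the top_k most frequent field-related words.

    tokenized_posts: list of token lists (already segmented; an assumption).
    domain_terms: hand-curated set of words relevant to the field of interest.
    """
    counts = Counter(
        token
        for tokens in tokenized_posts
        for token in tokens
        if token in domain_terms
    )
    return [word for word, _ in counts.most_common(top_k)]


def update_thesaurus(thesaurus, day, hot_words):
    """Store the hot words under the given day, modelling the daily update."""
    thesaurus[day.isoformat()] = list(hot_words)
    return thesaurus


posts = [
    ["vaccine", "price", "news"],
    ["vaccine", "recall"],
    ["price", "vaccine"],
]
domain = {"vaccine", "price", "recall"}
hot_words = build_hot_word_thesaurus(posts, domain, top_k=2)
thesaurus = update_thesaurus({}, date(2021, 4, 16), hot_words)
```

Keying the thesaurus by day keeps earlier snapshots available, which may be useful when later steps weight data by recency.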


Abstract

The invention discloses a public opinion tendency recognition method for training data whose sample categories are unevenly distributed. First, words related to the field of public opinion of interest are collected as public opinion hot words to build a thesaurus, and a comment data set is crawled from the public opinion source and divided into a training set and a test set. The public opinion tendency of the training set is then labeled manually, and bootstrap learning is used to compensate for category imbalance. Features are extracted from each category of training samples, and models are trained with algorithms such as naive Bayes, support vector machines, and decision trees. The trained model classifies the test set, and the public opinion tendency is identified from the classification results. Bootstrap learning, feature vector construction, and classification model training are all weighted by time sensitivity, so that the tendency they reflect favors recent data. The invention addresses the inaccurate classification caused by unbalanced training data and improves both the accuracy of public opinion tendency recognition and the timeliness of public opinion analysis.
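The pipeline summarized above can be sketched as follows. This is an illustration under stated assumptions, not the patent's procedure: the excerpt does not define its bootstrap learning step or its time-sensitive weights, so the sketch substitutes plain bootstrap resampling (sampling with replacement) for minority classes and an exponential decay with a hypothetical `half_life_days` parameter. The classifier is a small multinomial naive Bayes with add-one smoothing whose counts are scaled by the same weight. All function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def time_weight(age_days, half_life_days=7.0):
    """Assumed decay: a sample's weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def bootstrap_balance(samples, rng):
    """Oversample minority classes up to the majority class size.

    samples: list of (tokens, label, age_days) tuples.
    Extra samples are drawn with replacement, favoring recent ones.
    """
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        need = target - len(group)
        if need > 0:
            weights = [time_weight(s[2]) for s in group]
            balanced.extend(rng.choices(group, weights=weights, k=need))
    return balanced


def train_weighted_nb(samples):
    """Accumulate time-weighted class and word counts."""
    class_weight = defaultdict(float)
    word_weight = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for tokens, label, age_days in samples:
        w = time_weight(age_days)
        class_weight[label] += w
        for token in tokens:
            word_weight[label][token] += w
            vocab.add(token)
    return class_weight, word_weight, vocab


def predict(model, tokens):
    """Return the label with the highest smoothed log-probability."""
    class_weight, word_weight, vocab = model
    total = sum(class_weight.values())
    best_label, best_score = None, -math.inf
    for label, cw in class_weight.items():
        denom = sum(word_weight[label].values()) + len(vocab)
        score = math.log(cw / total)
        for token in tokens:
            if token in vocab:  # words never seen in training are skipped
                score += math.log((word_weight[label].get(token, 0.0) + 1.0) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


train = [
    (["good", "support"], "positive", 1),
    (["great", "good"], "positive", 2),
    (["support", "agree"], "positive", 3),
    (["bad", "angry"], "negative", 1),
]
balanced = bootstrap_balance(train, random.Random(0))
model = train_weighted_nb(balanced)
label_counts = Counter(label for _, label, _ in balanced)
```

In this toy data the single negative comment is resampled until both classes have three samples, so the class prior no longer favors the positive class by default.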

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing and machine learning. It relates to a method for analyzing public opinion tendency with machine learning algorithms, and in particular to a method for identifying public opinion tendency when training sample categories are unevenly distributed.

Background

[0002] Internet penetration is growing rapidly, the volume of news updated online is very large, and its impact on public opinion is correspondingly large. Public opinion tendency analysis emerged in this setting. It aims to analyze the public opinion generated on the Internet and to screen commenters' attitudes and changes in attitude in a timely manner, thereby helping regulatory authorities detect shifts in public opinion promptly and build a civilized and harmonious public opinion environment. [00...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F40/289, G06K9/62
CPC: G06F40/289, G06F18/24155
Inventor: 彭蓉王卓洪涛
Owner: WUHAN UNIV