
Method of classifying web text information sentiments

A sentiment classification technology for network text, applied in the field of sentiment classification of network text information. It addresses problems such as online content that undermines political stability and social harmony, and vulgar or borderline ("gray") remarks.

Inactive Publication Date: 2016-12-07
CHINA ELECTRONICS TECH CYBER SECURITY CO LTD
2 Cites, 28 Cited by

AI Technical Summary

Problems solved by technology

At the same time, some hostile forces exploit the anonymity of the Internet to create online public opinion that deviates from the mainstream political culture, undermining political stability and social harmony, while some netizens vent their emotions by posting vulgar and borderline ("gray") remarks online.
Traditional manual methods struggle to collect and assess the massive volume of information on the Internet. Automated sentiment classification is therefore needed to assess online public opinion and to prioritize the discovery of reactionary, sensitive, negative, and similar content.




Embodiment Construction

[0015] A method for sentiment classification of network text information, as shown in Figure 1 and Figure 2, includes the following steps:

[0016] Step 1. First, determine whether the document is news. If it is, only the title is extracted for sentiment classification; otherwise, the entire document is classified;

[0017] Step 2. Preprocess the documents that need to be classified:

[0018] Preprocessing means segmenting the text with the Chinese lexical analysis system ICTCLAS and then filtering out stop words:

[0019] Before sentiment classification can be performed on the target document, the data must be preprocessed, which mainly involves word segmentation and stop-word removal. The ICTCLAS Chinese word segmentation system, developed by the Institute of Computing Technology, Chinese Academy of Sciences, is based on the Hierarchical Hidden Markov Model; it can effectively segment the input text and output it following the part o...
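Steps 1 and 2 can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details from the source: the function names, the tiny stop-word list, and the `is_news` flag (the text does not say how news is detected, so it is passed in). ICTCLAS is not called; a whitespace splitter on pre-segmented text stands in for it.

```python
from typing import Callable, FrozenSet, List

# Hypothetical stop-word list; a real system would load a full Chinese list.
STOP_WORDS: FrozenSet[str] = frozenset({"的", "了", "是", "在"})


def select_text(title: str, body: str, is_news: bool) -> str:
    """Step 1: news documents are classified on the title only."""
    return title if is_news else body


def whitespace_segment(text: str) -> List[str]:
    """Stand-in segmenter that assumes the text is already space-separated.

    The source uses ICTCLAS at this point; its API is not shown there,
    so this sketch does not attempt to call it.
    """
    return text.split()


def preprocess(
    text: str,
    segment: Callable[[str], List[str]] = whitespace_segment,
    stop_words: FrozenSet[str] = STOP_WORDS,
) -> List[str]:
    """Step 2: segment the text, then filter out stop words."""
    return [token for token in segment(text) if token not in stop_words]
```

Passing the segmenter as a parameter keeps the stop-word filtering independent of the segmentation tool, so a real segmenter could be substituted without changing the rest of the sketch.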



Abstract

The invention discloses a method for classifying the sentiment of web text information, comprising the following steps: 1, determining whether a document is news, and if so, extracting only the title for sentiment classification, otherwise classifying the sentiment of the whole document; 2, preprocessing the document to be classified; 3, classifying the document according to text length: for a document longer than 140 characters, calculating feature weights using TF-IDF (term frequency-inverse document frequency) and classifying with a trained logistic regression classifier; and for a document not longer than 140 characters, classifying with manually formulated sentiment classification rules. Compared with the prior art, the method builds a technical route that combines a machine-learning classifier with classification features formulated by domain experts, according to the different characteristics of long and short texts, making it possible to find reactionary, sensitive, and negative information in online public opinion in a timely manner.
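The length-based dispatch in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the TF-IDF variant (`tf = count / len(doc)`, `idf = log(N / df)`) is one common choice that the source does not specify, all function names are hypothetical, and both the trained logistic regression classifier and the manual rule set are passed in as callables because neither is described in the text above.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence

LENGTH_THRESHOLD = 140  # characters, as stated in the abstract


def tf_idf(doc_tokens: Sequence[str], corpus: Sequence[Sequence[str]]) -> Dict[str, float]:
    """TF-IDF weights for one tokenized document against a tokenized corpus.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    Terms that never occur in the corpus are skipped.
    """
    n_docs = len(corpus)
    weights: Dict[str, float] = {}
    for term, count in Counter(doc_tokens).items():
        df = sum(1 for doc in corpus if term in doc)
        if df == 0:
            continue
        weights[term] = (count / len(doc_tokens)) * math.log(n_docs / df)
    return weights


def classify(
    text: str,
    tokens: List[str],
    corpus: Sequence[Sequence[str]],
    long_classifier: Callable[[Dict[str, float]], str],
    short_rules: Callable[[List[str]], str],
) -> str:
    """Route long texts to the TF-IDF classifier and short texts to rules."""
    if len(text) > LENGTH_THRESHOLD:
        return long_classifier(tf_idf(tokens, corpus))
    return short_rules(tokens)
```

Here `long_classifier` stands in for the trained logistic regression model and `short_rules` for the expert-written rules; a text of exactly 140 characters takes the rule-based branch in this sketch.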

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates to a method for sentiment classification of network text information.

Background

[0002] As a new type of media, the Internet plays a growing role in channeling public opinion, voicing appeals, enabling public oversight, and supporting participation in state affairs. More and more people use the Internet to express their interests and political appeals, as well as their own attitudes or views on hot social issues such as people's livelihood, justice, and anti-corruption. Especially when mass incidents or unexpected events occur, people often turn first to the Internet to pass on or obtain information. But at the same time, some hostile forces exploit the anonymity of the Internet to create online public opinion that deviates from the mainstream political culture, undermining political stability and social harmony; Netizens vent their emotions and publi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/35
Inventors: 姚春华, 杨颖, 唐明芳, 陈小玉, 鄢秋霞
Owner: CHINA ELECTRONICS TECH CYBER SECURITY CO LTD