Emotion classification model training and textual emotion polarity analysis method and system

A technology for emotion classification and emotion polarity analysis, applied in semantic analysis, neural learning methods, biological neural network models, etc. It addresses the curse of dimensionality, the inability to express semantic information, and the inability to reveal latent connections between words, so as to improve accuracy, reduce dimensionality, and avoid the curse of dimensionality.

Inactive Publication Date: 2016-04-20
RUN TECH CO LTD BEIJING
4 Cites, 46 Cited by

AI Technical Summary

Problems solved by technology

Sparse word-vector representations often run into the curse of dimensionality when applied to practical problems. They also cannot represent semantic information or reveal latent connections between words.
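
To make the problem concrete, the following is a minimal sketch, assuming one-hot encoding as the sparse representation (the excerpt does not name a specific one). The vocabulary and the `dense` vectors are hypothetical values chosen only for illustration. It shows that a one-hot vector is as long as the vocabulary and that any two distinct words have a dot product of zero, so no relation between words is visible.

```python
def one_hot(word, vocab):
    """Return the sparse one-hot vector of `word` over the list `vocab`."""
    return [1.0 if w == word else 0.0 for w in vocab]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy vocabulary; a real corpus has far more entries,
# and every one-hot vector grows with it.
vocab = ["happy", "glad", "sad", "market"]
happy = one_hot("happy", vocab)
glad = one_hot("glad", vocab)

# Hypothetical low-dimensional dense vectors (hand-written, not learned).
dense = {"happy": [0.9, 0.1], "glad": [0.8, 0.2], "sad": [-0.9, 0.1]}

print(len(happy))                           # one dimension per vocabulary word
print(dot(happy, glad))                     # distinct one-hot vectors are orthogonal
print(dot(dense["happy"], dense["glad"]))   # dense vectors can encode similarity
```

With dense vectors, related words can sit close together while the dimension stays fixed regardless of vocabulary size.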

Examples

Embodiment 1

[0045] Referring to Figure 1A, the solution of this embodiment can be implemented by a computer, specifically by a software program configured on a computer. The method for training an emotion classification model includes the following steps:

[0046] S110: Collect data from the corpus to obtain original data.

[0047] Exemplarily, the original data may be obtained by crawling content in the corpus with a crawler tool, or through other data collection methods.

[0048] A crawler can be a program that automatically obtains web content, and it can also be an important component of a search engine. Search engines use crawlers to search for web content. HTML (HyperText Markup Language) documents on the web are connected by hyperlinks, like a woven web, and crawlers crawl along this web: every time a crawler arrives at a web page, the crawler program grabs the page, and then the content is...
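
The link-following behaviour described in [0048] can be sketched as a breadth-first traversal. This is only an illustration under stated assumptions: the function name `crawl`, the `fetch` callable, the `max_pages` limit, and the in-memory `SITE` dictionary are all hypothetical and stand in for real HTTP requests, which the excerpt does not specify.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Follow hyperlinks breadth-first; `fetch(url)` returns HTML or None."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html  # grab the page content
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# Hypothetical in-memory "web" so the sketch runs without network access.
SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<p>good day</p> <a href="/">home</a>',
    "http://example.test/b": '<p>bad day</p> <a href="/a">A</a>',
}
pages = crawl("http://example.test/", SITE.get)
print(list(pages))
```

Passing the fetch function as a parameter keeps the traversal logic separate from how pages are actually downloaded.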

Embodiment 2

[0065] On the basis of Embodiment 1 of the present invention, this embodiment further provides a preferred implementation manner of step S120 in the technical solution of Embodiment 1, that is, performing preprocessing on raw data to obtain preprocessed data.

[0066] Referring to Embodiment 1 of the present invention, as shown in Figure 2, step S120, that is, preprocessing the original data to obtain the preprocessed data, may include:

[0067] S121: Clean the original data to obtain the cleaned data.

[0068] Exemplarily, the unrecognizable data and non-literal characters in the raw data previously acquired by the crawler tool are removed to obtain the cleaned data, which is convenient for subsequent word segmentation, stop word removal and word vector extraction operations.

[0069] S122: Perform word segmentation and stop word removal processing on the cleaned data to obtain preprocessed data.

[0070] Exemplarily, an open source word segmentation tool or a purchased non-o...
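
Step S122 can be sketched as follows. This is a simplified illustration, not the tool the patent refers to: it assumes space-delimited text and splits it with a regular expression, whereas Chinese text would need a dedicated word segmentation tool. The `STOP_WORDS` set and the function names are hypothetical.

```python
import re

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "is", "of", "and"}


def segment(text):
    """Split lowercase text into word tokens (stand-in for a segmentation tool)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def preprocess(text, stop_words=STOP_WORDS):
    """Segment the cleaned text, then drop stop words."""
    return [token for token in segment(text) if token not in stop_words]


print(preprocess("The market is full of fear and greed"))
```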

Embodiment 3

[0073] On the basis of the second embodiment, this embodiment further provides a preferred implementation manner of step S121 in the technical solution of the second embodiment, that is, cleaning the original data and obtaining the cleaned data.

[0074] Referring to Embodiment 2 of the present invention, as shown in Figure 3, step S121, that is, cleaning the original data to obtain the cleaned data, may include:

[0075] S1211: Delete HTML tags and URLs in the original data.

[0076] Exemplarily, HyperText Markup Language (HTML) tags, Uniform Resource Locators (URLs), and the like in the raw data have nothing to do with the sentence itself and do not constitute words, so these HTML tags and URLs need to be deleted to facilitate the subsequent extraction of word vectors.

[0077] S1212: When the content in the corpus is Chinese, convert traditional characters in the original data into simp...
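
Step S1211 can be sketched with two regular expressions. This is a minimal sketch under assumptions: the patterns `TAG_RE` and `URL_RE` and the function `clean` are hypothetical, and regex-based tag stripping is a simplification that does not handle every HTML edge case. The traditional-to-simplified conversion of S1212 is not covered, since it needs a conversion table or library.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")                 # anything shaped like an HTML tag
URL_RE = re.compile(r"(?:https?://|www\.)\S+")  # bare URLs left in the text


def clean(raw):
    """Delete HTML tags and URLs, then collapse the leftover whitespace."""
    text = TAG_RE.sub(" ", raw)   # URLs inside tag attributes disappear here too
    text = URL_RE.sub(" ", text)
    return " ".join(text.split())


raw = '<p>Stocks rally <a href="http://x.test/n">today</a> see https://x.test/more now</p>'
print(clean(raw))
```

Tags are removed before URLs so that a URL directly followed by a closing tag is separated from it by a space first.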

Abstract

The invention provides an emotion classification model training method and system and a textual emotion polarity analysis method and system. The emotion classification model training method comprises the following steps: data are collected from a corpus to obtain original data; the original data are preprocessed to obtain preprocessed data; word vectors are extracted from the preprocessed data through a neural network model; the word vectors are fused according to preset fusion rules to generate sentence vector features; and an emotion classification model is trained on the sentence vector features to obtain the trained emotion classification model. Because a neural network model is adopted, words are expressed as low-dimensional vectors, the low-dimensional word vectors are fused into sentence vector features according to the preset rules, and the emotion classification model is obtained by training certain learning models. As a result, the word vector dimension can be effectively reduced, the curse of dimensionality can be avoided, correlations between words can be mined, and the semantic accuracy of the vectors can be enhanced.
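
The last two steps of the abstract, fusing word vectors into a sentence vector and training a classifier on the result, can be sketched as below. The assumptions are substantial and should be read as such: the excerpt does not state the preset fusion rule or the learning model, so element-wise averaging and logistic regression are swapped in purely for illustration. The `EMBEDDINGS` table is hand-written rather than learned by a neural network, and all function names are hypothetical.

```python
import math


def fuse_mean(tokens, embeddings):
    """Fuse word vectors into one sentence vector by element-wise averaging."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no known words in sentence")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def train_logistic(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    """Return 1 (positive polarity) or 0 (negative polarity)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Hypothetical 2-dimensional word vectors and a tiny labelled corpus.
EMBEDDINGS = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 1.0]}
sentences = [["good", "movie"], ["bad", "movie"], ["good"], ["bad"]]
labels = [1, 0, 1, 0]

features = [fuse_mean(s, EMBEDDINGS) for s in sentences]
w, b = train_logistic(features, labels)
print([predict(x, w, b) for x in features])
```

Averaging keeps the sentence vector at the same fixed dimension as the word vectors, whatever the sentence length, which is what lets a standard classifier consume it.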

Description

Technical field

[0001] The invention relates to the technical field of data mining, in particular to a method and system for training an emotion classification model and a method and system for analyzing text emotion polarity.

Background

[0002] Sentiment analysis, also known as tendency analysis, is the process of analyzing, processing, summarizing, and reasoning about subjective texts with emotional color. Common sentiment analysis tasks include opinion extraction, opinion mining, sentiment mining, and subjectivity analysis.

[0003] In financial information analysis, investors have long recognized that financial markets are easily driven by human emotions such as fear and greed, but the technology or data needed to objectively and comprehensively quantify people's specific emotions has been lacking. Sentiment analysis of social data opens a window into people's inner world for investors who have been troubled by irrational actions in the financial market, and predicts...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/08, G06F17/27
CPC: G06N3/088, G06F40/30, G06F18/2111, G06F18/214
Inventor: 张建华, 刘鹏
Owner: RUN TECH CO LTD BEIJING