Text message processing method and device

A text information processing method and device, applicable to electrical digital data processing, special data processing applications, and natural language data processing. It addresses two problems of existing methods: the emotional attributes of a word to be analyzed cannot be obtained when the word is absent from the sample words, and the sample words must be updated quickly. The method avoids poor analysis results, reduces update requirements, and improves accuracy.

Status: Inactive. Publication date: 2017-03-29
四川无声信息技术有限公司
8 Cites, 22 Cited by

AI Technical Summary

Problems solved by technology

However, the existing methods rely heavily on these sample words. If the words to be analyzed cannot be found in the sample words, the emotional attributes of the words to be analyzed cannot be obtained.



Examples


Example 1

[0031] Figure 2 shows a flow chart of a text information processing method provided by the first embodiment of the present invention. Referring to Figure 2, the text information processing method includes:

[0032] Step S110, acquiring text information;

[0033] In this embodiment, the text information is mainly Chinese short text, which can be entered through the input and output unit 160 or obtained over the network. The text information may also be text in other languages, for example English.

[0034] Step S120, performing word segmentation processing on the text information to obtain a plurality of undetermined words;

[0035] English text uses a space as a natural delimiter between two adjacent words, but Chinese text has no obvious delimiter between adjacent words. Therefore, when the text information is Chinese text, Chinese word segmentation processing is required. Chin...
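The segmentation step can be illustrated with a minimal sketch. The source text is cut off before it names the segmentation technique, so the forward maximum matching approach below is an assumption chosen for illustration, not necessarily the method of the embodiment. The function name `forward_max_match`, the parameter `max_word_len`, and the toy dictionary are all hypothetical.

```python
def forward_max_match(text, vocabulary, max_word_len=4):
    """Segment text greedily, taking the longest dictionary word at each position.

    Illustrative only: the patent excerpt does not specify this algorithm.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary; a real system would use a large lexicon.
TOY_VOCABULARY = {"这部", "电影", "非常", "好看"}

# "This movie is very good" split into undetermined words.
print(forward_max_match("这部电影非常好看", TOY_VOCABULARY))
```

Each element of the returned list plays the role of one undetermined word passed on to the later steps.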

Example 2

[0066] Figure 5 shows a flow chart of a text information processing method provided by the second embodiment of the present invention. Referring to Figure 5, the text information processing method includes:

[0067] Step S210, acquiring corpus;

[0068] In this embodiment, the Sohu news data (SogouCS) provided by Sogou Lab, as well as microblog posts or forum comments collected by a web crawler, can be used as the training corpus.

[0069] Step S220, training on the corpus with the word2vec algorithm to obtain a correspondence table containing a plurality of training words and the word vector corresponding to each of the training words;

[0070] Word2vec represents each word as a distributed representation word vector, so that similarity in the vector space can be used to represent the semantic similarity of words. Word2vec improves on the Neural Network Language Model (NNLM), which is based on a three-layer neural network, and proposes two log-linear models: continuous-bag-of-words, C...
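As a rough illustration of the continuous-bag-of-words idea, the sketch below builds the (context words, target word) training examples that such a model learns from: each word is predicted from its neighbors inside a sliding window. It only prepares training data and does not train a network. The function name `cbow_training_pairs` and the window size are assumptions for illustration and do not come from the source.

```python
def cbow_training_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs for a continuous-bag-of-words model.

    Illustrative only: shows how training examples are formed from segmented text.
    """
    pairs = []
    for i, target in enumerate(tokens):
        # Words up to `window` positions to the left and right form the context.
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip single-word inputs that have no context
            pairs.append((context, target))
    return pairs


# Segmented toy sentence (hypothetical data), using a window of one word.
for context, target in cbow_training_pairs(["这部", "电影", "非常", "好看"], window=1):
    print(context, "->", target)
```

Words that occur in similar contexts end up with similar vectors after training, which is what makes vector similarity usable as a proxy for semantic similarity.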

Example 3

[0084] Figure 9 shows a block diagram of the functional modules of a text information processing device provided by the third embodiment of the present invention. The text information processing device provided in this embodiment can run in the computer 100 and is used to implement the text information processing method provided in the first embodiment. Referring to Figure 9, the text information processing device 10 provided in this embodiment includes: a text information acquisition module 11, a word segmentation module 12, a word vector acquisition module 13, a similarity calculation module 14, and an emotion attribute determination module 15.

[0085] Wherein, the text information acquisition module 11 is used to acquire text information;

[0086] The word segmentation module 12 is used to perform word segmentation processing on the text information to obtain a plurality of undetermined words;

[0087] The word vector acquisition module 13 is used to obtain the word v...


Abstract

The invention provides a text information processing method and device, and belongs to the technical field of natural language processing and data mining. The method comprises the following steps: acquiring text information; performing word segmentation processing on the text information to obtain a plurality of undetermined words; acquiring the word vector corresponding to each undetermined word; calculating the similarity between the word vector of each undetermined word and the word vector of each emotion word in a preset emotion dictionary; and judging the emotion attribute of the text information according to these similarities. Compared with existing methods, the method and device reduce the requirement on how quickly the emotion dictionary must be updated, avoid the poor emotion analysis results caused by an emotion dictionary that is not updated in time, and effectively improve the accuracy of the analysis results.
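The similarity and judgment steps can be sketched as follows. Cosine similarity is assumed as the similarity measure, and the decision rule (each word votes with the polarity of its most similar emotion word when the similarity reaches a threshold) is a hypothetical choice, since the visible text does not state the exact rule. The names `cosine_similarity`, `judge_emotion`, `threshold`, and the toy two-dimensional vectors are all illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def judge_emotion(words, word_vectors, emotion_dictionary, threshold=0.7):
    """Judge the emotion attribute of a text from its segmented words.

    Assumed rule (not from the source): each word votes with the polarity of its
    most similar emotion word, but only if that similarity reaches `threshold`.
    """
    score = 0
    for word in words:
        vector = word_vectors.get(word)
        if vector is None:
            continue  # no vector available for this word
        best_similarity, best_polarity = 0.0, 0
        for emotion_word, polarity in emotion_dictionary.items():
            emotion_vector = word_vectors.get(emotion_word)
            if emotion_vector is None:
                continue
            similarity = cosine_similarity(vector, emotion_vector)
            if similarity > best_similarity:
                best_similarity, best_polarity = similarity, polarity
        if best_similarity >= threshold:
            score += best_polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical toy word vectors and emotion dictionary (+1 positive, -1 negative).
TOY_VECTORS = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "awesome": [0.9, 0.1],
    "terrible": [-0.9, 0.1],
    "movie": [0.0, 1.0],
}
TOY_EMOTION_DICTIONARY = {"good": 1, "bad": -1}

print(judge_emotion(["awesome", "movie"], TOY_VECTORS, TOY_EMOTION_DICTIONARY))
```

In the toy data, "awesome" is not in the emotion dictionary, yet it still contributes because its vector lies close to that of "good". This is the property that lets the approach tolerate a dictionary that lags behind new vocabulary.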

Description

Technical field

[0001] The present invention relates to the technical field of natural language processing and data mining, and in particular to a text information processing method and device.

Background

[0002] Sentiment analysis of text information is the process of analyzing, processing, summarizing, and drawing inferences from emotionally subjective text. It is widely used in online public opinion analysis and early warning, business decision-making, and similar applications. Traditional sentiment analysis methods rely mainly on the sentiment attributes of sample words. For example, the sample words include a large number of emotional words, and the emotional attributes of the words to be analyzed are determined by looking them up in the sample words. However, the existing methods rely heavily on these sample words. If the words to be analyzed cannot be found in the sample words, the emotional attributes of the words to be analyzed cannot be obtained. That is to say, the existing methods require a ...


Application Information

IPC(8): G06F17/27
CPC: G06F40/289, G06F40/30
Inventor: 黄勇卢康张磊宋国志崔凯铜
Owner: 四川无声信息技术有限公司