A Chinese text sentiment analysis method based on deep learning

A technology combining sentiment analysis and deep learning, applied in text database clustering/classification, unstructured text data retrieval, special data processing applications, etc. It addresses the problem of having too many model network parameters, and achieves the effects of improving prediction accuracy, shortening training time, and simplifying model parameters.

Active Publication Date: 2019-04-30
SICHUAN XW BANK CO LTD

AI Technical Summary

Problems solved by technology

[0011] In view of the problems above, the purpose of the present invention is to provide a Chinese text sentiment analysis method based on deep learning that addresses the weaknesses of the prior-art unsupervised sentiment analysis method developed for English: (1) that method cannot be applied to Chinese; (2) its language model is time-consuming to train, because the model network has many parameters and pre-training the language model takes a long time; (3) it uses only the representation of the last character of each sentence as the feature representation of that sentence, and the vector of the last character cannot represent the entire sentence, which reduces the accuracy of sentiment classification for each sentence.


Examples


Embodiment example

[0061] Crawl news from the online loan home depository platform as corpus text (including text, punctuation marks, etc.), giving 100,000 corpus texts;

[0062] Convert the Chinese characters in the 100,000 corpus texts to pinyin. Working at character granularity, remove low-frequency characters that appear fewer than 10 times in the corpus texts. Deduplicate the remaining characters and map them to numeric indexes to obtain a mapping dictionary between characters and numeric indexes. Then use the mapping dictionary to represent the corpus text in numeric form, giving the preprocessed text data, whose number of text characters is n;
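The preprocessing in [0062] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes the text has already been converted to pinyin (for example with a library such as pypinyin), the function names build_vocab and encode are hypothetical, the sample strings are invented, and the threshold is lowered from 10 to 2 only so that the toy corpus keeps some characters.

```python
from collections import Counter

def build_vocab(texts, min_count=10):
    """Map every character seen at least min_count times to an integer index."""
    counts = Counter(ch for text in texts for ch in text)
    kept = sorted(ch for ch, c in counts.items() if c >= min_count)
    return {ch: i for i, ch in enumerate(kept)}

def encode(text, vocab):
    """Represent text as a list of indexes, dropping low-frequency characters."""
    return [vocab[ch] for ch in text if ch in vocab]

# Toy corpus, assumed to be already converted to pinyin (hypothetical samples).
corpus = ["wang dai ping tai", "dai kuan li lv"]
vocab = build_vocab(corpus, min_count=2)  # the embodiment uses a threshold of 10
encoded = [idx for text in corpus for idx in encode(text, vocab)]
n = len(encoded)  # number of text characters after preprocessing
```

Sorting the kept characters before assigning indexes is a design choice made here so that the dictionary is reproducible across runs; the source does not say how indexes are ordered.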

[0063] Starting from the beginning of the preprocessed text data, divide the corpus text with a step size of 1 into sequences of length 64. The first 63 characters of each sequence are the model input x, and the last character is the model output y. After dividing the corpus ...
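The windowing described in [0063] might look like the sketch below. The function name make_windows is hypothetical and the input is a stand-in list of integers rather than real encoded text; only the step size of 1, the sequence length of 64, and the 63/1 split between input and output come from the text above.

```python
def make_windows(indexes, seq_len=64, step=1):
    """Slide a window of seq_len over indexes; return (inputs, targets)."""
    xs, ys = [], []
    for start in range(0, len(indexes) - seq_len + 1, step):
        window = indexes[start:start + seq_len]
        xs.append(window[:-1])  # first seq_len - 1 characters: model input x
        ys.append(window[-1])   # last character: model output y
    return xs, ys

# Stand-in for preprocessed text data of n = 70 characters.
data = list(range(70))
xs, ys = make_windows(data)
```

With a step size of 1, this arithmetic gives n - 64 + 1 training pairs for n characters, so the 70-character stand-in produces 7 pairs.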


Abstract

The invention discloses a Chinese text sentiment analysis method based on deep learning, and belongs to the technical field of natural language processing. It overcomes the defects of an unsupervised sentiment analysis method based on English. The method comprises the following steps: after converting an obtained corpus text into pinyin, pre-training a constructed language model to obtain a pre-trained language model; obtaining a small amount of text data that is in the same field as the corpus text and carries sentiment category labels, converting the text in the text data into pinyin, and training a constructed sentiment classification model based on the pre-trained language model to obtain a trained sentiment analysis model; and carrying out sentiment classification on unlabeled text by utilizing the trained sentiment analysis model to obtain a corresponding sentiment category label. The method is used for Chinese text sentiment analysis.

Description

Technical field

[0001] A Chinese text sentiment analysis method based on deep learning is used for sentiment analysis of Chinese text, and belongs to the technical field of natural language processing.

Background art

[0002] Text sentiment analysis refers to judging the emotional tendency of text.

[0003] The language model is used to calculate the probability of a sentence and judge whether a sentence is reasonable.

[0004] RNN is a recurrent neural network, a type of neural network for processing sequence data.

[0005] LSTM is a long short-term memory network, a special type of RNN, which can learn long-term dependent information.

[0006] GRU is a gated recurrent unit, a variant of LSTM, which simplifies the structure of the LSTM model.

[0007] In existing Chinese text sentiment analysis, most methods classify text using Chinese characters directly, and the accuracy of sentiment classification predicted from Chinese characters is low; while most of the ...
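To make the idea in [0003] concrete, the sketch below scores a character sequence with a count-based bigram model and add-one smoothing. This is a deliberately simple stand-in chosen for illustration, not the neural language model of the invention; the function names and the toy pinyin strings are hypothetical.

```python
import math
from collections import Counter

def train_bigram(text):
    """Count how often each character occurs as a context and in each bigram."""
    unigrams = Counter(text[:-1])
    bigrams = Counter(zip(text, text[1:]))
    return unigrams, bigrams

def log_prob(seq, unigrams, bigrams, vocab_size):
    """Log-probability of seq[1:] given seq[0], with add-one smoothing."""
    total = 0.0
    for prev, cur in zip(seq, seq[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

train = "ni hao ni hao ni hao"  # toy pinyin training text
unigrams, bigrams = train_bigram(train)
vocab_size = len(set(train))
seen = log_prob("ni hao", unigrams, bigrams, vocab_size)
unseen = log_prob("oa hin", unigrams, bigrams, vocab_size)
```

A sequence whose character transitions appear in the training text receives a higher log-probability than a scrambled sequence of the same length, which is the sense in which a language model judges whether a sentence is reasonable.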


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F17/22
CPC: G06F40/126
Inventor: 朱玲, 张友书, 陈思成
Owner: SICHUAN XW BANK CO LTD