Automatic studying and judging method for emotional tendency of Internet information

The technology concerns automatic research and judgment of the emotional tendency of Internet information. It addresses the low accuracy, insufficient generalization, and poor performance of existing research and judgment models, and aims for strong model generalization, improved model performance, and good robustness.

Pending Publication Date: 2021-10-22
西安康奈网络科技有限公司

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide an automatic research and judgment method for the emotional tendency of Internet information. It addresses three problems in traditional public opinion sentiment research and judgment: low accuracy, insufficient generalization of the research and judgment models, and poor performance on complex Chinese contexts involving obscurity and ambiguity.

Examples

Embodiment 1

[0043] The public opinion corpus data set is preprocessed in the following steps:

[0044] Collect the public opinion data with emotional tendency from the public opinion corpus data set and clean it;

[0045] Format the public opinion data;

[0046] Convert the public opinion data as needed using Chinese character dictionary files;

[0047] Preprocess the public opinion data using multiple processes.
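The four preprocessing steps can be sketched as follows. This is a minimal illustration, not the patent's implementation: the cleaning rules (stripping HTML tags and URLs), the use of Unicode NFKC normalization as the "formatting" step, and the tiny traditional-to-simplified `CHAR_MAP` standing in for a character dictionary file are all assumptions, as are the function names.

```python
import re
import unicodedata
from multiprocessing import Pool

# Hypothetical stand-in for a Chinese character dictionary file
# (here: a few traditional -> simplified mappings).
CHAR_MAP = str.maketrans({"體": "体", "國": "国", "輿": "舆", "論": "论"})

TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+")
SPACE_RE = re.compile(r"\s+")


def clean(text):
    """Data cleaning: drop HTML tags and URLs, collapse whitespace."""
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def normalize(text):
    """Formatting: fold full-width and compatibility forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def convert(text):
    """Dictionary-based character conversion."""
    return text.translate(CHAR_MAP)


def preprocess_one(text):
    return convert(normalize(clean(text)))


def preprocess_corpus(texts, workers=1):
    """Run the pipeline, optionally across several worker processes."""
    if workers <= 1:
        results = [preprocess_one(t) for t in texts]
    else:
        with Pool(workers) as pool:
            results = pool.map(preprocess_one, texts, chunksize=256)
    # Discard records that are empty after cleaning.
    return [r for r in results if r]
```

When run as a script with `workers` greater than 1, the call to `preprocess_corpus` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.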

Embodiment 2

[0049] Referring to Figure 2, the pre-training model architectures differ as follows. BERT uses a bidirectional Transformer. OpenAI GPT uses a left-to-right Transformer. ELMo concatenates independently trained left-to-right and right-to-left LSTMs to generate features for downstream tasks. Of the three, only BERT representations are jointly conditioned on both left and right context in all layers. Apart from the architectural differences, BERT and OpenAI GPT are fine-tuning approaches, while ELMo is a feature-based approach.
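The difference between bidirectional and left-to-right Transformers comes down to which positions each token may attend to. The sketch below builds the two attention masks as plain nested lists. The function name and the mask label `"causal"` are illustrative assumptions; `"fully_visible"` mirrors the mask type named in the fine-tuning settings.

```python
def attention_mask(seq_len, mask_type="fully_visible"):
    """Return a seq_len x seq_len mask; mask[i][j] == 1 means
    position i may attend to position j."""
    if mask_type == "fully_visible":
        # Bidirectional (BERT-style): every token sees left and right context.
        return [[1] * seq_len for _ in range(seq_len)]
    if mask_type == "causal":
        # Left-to-right (GPT-style): token i sees only positions j <= i.
        return [[1 if j <= i else 0 for j in range(seq_len)]
                for i in range(seq_len)]
    raise ValueError(f"unknown mask_type: {mask_type!r}")


for row in attention_mask(3, "causal"):
    print(row)
```

With a fully visible mask every row is all ones, which is what allows each layer to condition on both left and right context.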

[0050] In the present invention, pre-training on the public opinion corpus data set is based on deep learning, and the training process uses mixed precision with multiple machines and multiple GPUs. This application pre-trains a RoBERTa model for one week on a 160 GB public opinion corpus with a batch size of 64, using 3 machines, each with 8 GPUs (NVIDIA Tesla V100 16G) f...
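Mixed-precision training usually pairs half-precision arithmetic with loss scaling, because small gradients underflow to zero in 16-bit floats. The source does not say whether or how loss scaling is used, so the following is only a standard-library illustration of the general idea; the function names and the scale factor 1024 are assumptions.

```python
import struct


def to_fp16(x):
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("e", struct.pack("e", x))[0]


def grad_through_fp16(grad, loss_scale=1.0):
    """Simulate a gradient stored in fp16, optionally with loss scaling.

    The gradient is multiplied by loss_scale before the fp16 round trip
    and divided by it afterwards, in full precision.
    """
    scaled = to_fp16(grad * loss_scale)
    return scaled / loss_scale


tiny = 1e-8
print(grad_through_fp16(tiny))          # underflows in fp16
print(grad_through_fp16(tiny, 1024.0))  # survives thanks to scaling
```

Without scaling the gradient is lost entirely; with scaling it is recovered to within fp16 rounding error.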

Embodiment 3

[0057] As shown in Figure 1 and Figure 3, the pre-trained model is fine-tuned on the downstream task dataset with a learning rate of 3e-4, a batch size of 64, 12 epochs, and the mask type set to fully_visible.
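These fine-tuning settings can be captured in a small configuration object. The four default values come from the text; the class name, field names, and the step-count helpers are assumptions for illustration, and the dataset size in the usage line is hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FineTuneConfig:
    learning_rate: float = 3e-4
    batch_size: int = 64
    epochs: int = 12
    mask_type: str = "fully_visible"

    def steps_per_epoch(self, num_examples: int) -> int:
        # Ceiling division: the last batch may be partially filled.
        return -(-num_examples // self.batch_size)

    def total_steps(self, num_examples: int) -> int:
        return self.epochs * self.steps_per_epoch(num_examples)


cfg = FineTuneConfig()
print(asdict(cfg))
print(cfg.total_steps(1000))  # hypothetical dataset of 1000 examples
```

Keeping the values in one frozen object makes it easy to log them alongside results or vary them during a hyperparameter search.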

[0058] The RoBERTa model follows BERT's two-stage procedure of pre-training followed by fine-tuning. Except for the output layer, the same architecture is used in both pre-training and fine-tuning. The same pre-trained parameters initialize the models for different downstream tasks, and all parameters are updated during fine-tuning. [CLS] is a special symbol added before each input example, and [SEP] is a special separator token (for example, between a question and its answer).
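The role of [CLS] and [SEP] can be illustrated by packing one or two token sequences into a single model input. This is a sketch under assumptions: the function name is hypothetical, and the segment ids (0 for the first segment, 1 for the second) follow the usual BERT convention rather than anything stated in the source.

```python
def pack_pair(tokens_a, tokens_b=None):
    """Build [CLS] A [SEP] (B [SEP]) plus per-token segment ids."""
    tokens = ["[CLS]"] + list(tokens_a) + ["[SEP]"]
    segment_ids = [0] * len(tokens)
    if tokens_b is not None:
        tokens += list(tokens_b) + ["[SEP]"]
        # The second segment and its closing [SEP] get id 1.
        segment_ids += [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


tokens, segments = pack_pair(["the", "man", "went"], ["he", "bought", "milk"])
print(tokens)
print(segments)
```

For a single-sentence task such as sentiment classification, only `tokens_a` is passed and the output layer typically reads the representation at the [CLS] position.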

[0059] The prediction task of the final model of this application is as follows:

[0060] Input = [CLS] the man went to [MASK] store [SEP] he bought a gallon [MASK] milk [SEP]

[0061] Label = IsNext

[0062] Input = [CLS] the man [MASK] to the store [SEP] penguin [MASK] are flight ##less birds [SEP]

[...
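Inputs like the ones above, with some tokens replaced by [MASK], can be produced by a simple masking routine. The sketch below is a simplification under assumptions: the 15% masking rate is the figure commonly used for BERT and is not stated in the source, every selected token is replaced by [MASK] (BERT also sometimes keeps or randomizes the token), and the function name and seed are illustrative.

```python
import random

SPECIAL = {"[CLS]", "[SEP]"}


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of non-special tokens with [MASK].

    Returns the masked sequence and a dict {position: original token}
    holding the prediction targets.
    """
    rng = random.Random(seed)
    candidates = [i for i, t in enumerate(tokens) if t not in SPECIAL]
    if not candidates:
        return list(tokens), {}
    n_mask = max(1, round(len(candidates) * mask_prob))
    chosen = set(rng.sample(candidates, n_mask))
    masked = ["[MASK]" if i in chosen else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(chosen)}
    return masked, targets


example = ("[CLS] the man went to the store [SEP] "
           "he bought a gallon of milk [SEP]").split()
masked, targets = mask_tokens(example, seed=1)
print(" ".join(masked))
print(targets)
```

The special tokens are never masked, so the sentence boundaries used for the IsNext label remain intact.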

Abstract

The invention discloses an automatic studying and judging method for the emotional tendency of Internet information and relates to the technical field of language sentiment analysis. The method pre-trains a RoBERTa model on general corpora and fine-tunes it on downstream tasks; the deep learning training process uses mixed precision with multiple machines and multiple GPUs; and after hyperparameter search and training are complete, the model is deployed and an interface is provided to carry out automatic studying and judging. The method addresses the low accuracy, insufficient model generalization, and poor performance on complex Chinese contexts involving obscurity and ambiguity that affect traditional public opinion sentiment studying and judging work.

Description

Technical field

[0001] The invention relates to the technical field of language sentiment analysis, in particular to an automatic research and judgment method for the sentiment tendency of Internet information.

Background

[0002] According to the 47th "Statistical Report on China's Internet Development" released by the China Internet Network Information Center (CNNIC), as of December 20, 2020, the number of Internet users in China reached 989 million. The Internet therefore collects and provides a large amount of data, and analyzing this information is an essential step in handling Internet public opinion.

[0003] With the continuous and in-depth development of the Internet era, Internet public opinion sentiment analysis has become an indispensable means of understanding social conditions and public opinion, grasping public opinion trends, and responding quickly to emergencies. The automatic research and ...

Application Information

IPC(8): G06F16/33, G06F16/36, G06F16/953, G06F40/211, G06F40/253, G06F40/30
CPC: G06F16/3344, G06F16/374, G06F16/953, G06F40/211, G06F40/253, G06F40/30
Inventor 郭齐
Owner 西安康奈网络科技有限公司