Twitter sentiment classification text processing optimization system based on word embedding

A technology for text processing and sentiment classification, applied in the field of text processing to reduce negative impacts and optimize sentiment classification pipelines

Pending Publication Date: 2021-07-09
山西三友和智慧信息技术股份有限公司

AI Technical Summary

Problems solved by technology

[0004] To address the problem that some colloquial key information is filtered out, the present invention provides a system that mitigates the negative impact of traditional text processing methods on sentiment classification.

Examples


Embodiment Construction

[0015] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0016] A text processing optimization system for Twitter sentiment classification based on word embedding, as shown in Figure 1, includes the following modules: a text processing module, a word embedding module, a tweet embedding module, and a model training and verification module. The modules are communicatively connected in parallel. The text processing module uses traditional text processing to eliminate the influence of stop words on tweet classification; the word...

Abstract

The invention belongs to the field of text processing, and in particular relates to a Twitter sentiment classification text processing optimization system based on word embedding. The system comprises the following modules: a text processing module, a word embedding module, a tweet embedding module, and a model training and verification module. The text processing module uses traditional text processing to eliminate the influence of stop words on tweet classification. The word embedding module uses a skip-gram model to learn word embeddings specific to the Twitter context. The tweet embedding module averages the m n-dimensional word vectors of a tweet to obtain an n-dimensional tweet vector, and compares the performance of three aggregation methods: the sum of the n-dimensional word vectors, their weighted average, and the sum of the k most important word vectors, selected according to word importance. The model training and verification module trains and validates the model using 10×2 nested cross-validation.
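The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes each tweet is already represented as m word vectors of dimension n, and that a per-word importance weight is available. The source does not say how importance is computed, so the `weights` argument and all function names below are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def sum_vectors(vectors: Sequence[Vector]) -> Vector:
    """Element-wise sum of m n-dimensional word vectors."""
    return [sum(column) for column in zip(*vectors)]


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Plain average: the n-dimensional tweet vector."""
    m = len(vectors)
    return [total / m for total in sum_vectors(vectors)]


def weighted_mean_vector(vectors: Sequence[Vector],
                         weights: Sequence[float]) -> Vector:
    """Average in which each word vector counts in proportion to its weight.

    `weights` is an assumed per-word importance score (for example a
    TF-IDF value); the source does not define it.
    """
    total_weight = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, column)) / total_weight
        for column in zip(*vectors)
    ]


def top_k_sum(vectors: Sequence[Vector],
              weights: Sequence[float],
              k: int) -> Vector:
    """Sum of the k word vectors with the highest importance weight."""
    ranked = sorted(zip(weights, vectors), key=lambda pair: pair[0], reverse=True)
    return sum_vectors([vector for _, vector in ranked[:k]])


# Toy tweet with m = 3 words and n = 2 dimensions.
word_vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
importance = [1.0, 2.0, 5.0]

tweet_mean = mean_vector(word_vectors)
tweet_sum = sum_vectors(word_vectors)
tweet_weighted = weighted_mean_vector(word_vectors, importance)
tweet_top2 = top_k_sum(word_vectors, importance, k=2)
```

All four functions return a vector with the same dimension n as the input word vectors, so any of them can feed the same downstream classifier when comparing aggregation methods.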

Description

Technical field

[0001] The invention belongs to the field of text processing, and in particular relates to a Twitter sentiment classification text processing optimization system based on word embedding.

Background

[0002] Current text processing steps are often performed using off-the-shelf routines and pre-built word dictionaries, without optimization for the domain, application, or context. Keywords such as Twitter colloquialisms, emoji, and hashtags are usually removed because they do not appear in traditional literary corpora, yet they play a large role in Twitter sentiment classification.

[0003] Cause of the problem: at present, in the text processing stage of Twitter sentiment analysis, the traditional approach of using a pre-built word dictionary filters out some colloquial key information.

Contents of the invention

[0004] Aiming at the above-mentioned problem that some colloquial key information is filtered, the pr...
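The concern raised in paragraph [0002], that hashtags, emoji, and emoticons are discarded by dictionary-based preprocessing, can be illustrated with a small tokenizer that removes stop words but keeps those tokens. This is only a sketch under stated assumptions: the regular expressions, the tiny stop-word list, and the name `tokenize_tweet` are hypothetical and are not taken from the patent.

```python
import re
import string

# Hypothetical token patterns; the alternatives are tried in this order.
HASHTAG_OR_MENTION = r"[#@]\w+"
EMOTICON = r"[:;=][-']?[)(DPp/|\\]"   # e.g. :) ;-) :D =P
WORD = r"\w+(?:'\w+)?"                # words, optionally with an apostrophe
SYMBOL = r"[^\w\s]"                   # any other single symbol, including emoji
TOKEN_RE = re.compile("|".join([HASHTAG_OR_MENTION, EMOTICON, WORD, SYMBOL]))

# Illustrative stop-word list only; a real system would use a fuller one.
STOP_WORDS = frozenset({"a", "an", "and", "is", "the", "this", "to"})
ASCII_PUNCTUATION = frozenset(string.punctuation)


def tokenize_tweet(text):
    """Split a tweet into tokens, dropping stop words and bare ASCII
    punctuation while keeping hashtags, mentions, emoticons, and emoji."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        token = match.group(0)
        # Lowercase words, hashtags, and mentions, but leave emoticons
        # such as ":D" untouched so their case is preserved.
        if token[0].isalnum() or token[0] in "#@_":
            token = token.lower()
        if token in STOP_WORDS or token in ASCII_PUNCTUATION:
            continue
        tokens.append(token)
    return tokens


example_tokens = tokenize_tweet("This is the BEST day :) #blessed \U0001F602!")
```

In this example the stop words and the trailing exclamation mark are dropped, while the emoticon, the hashtag, and the emoji survive as separate tokens that a word embedding model could then learn vectors for.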

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/247, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/35, G06F40/247, G06F40/30, G06N3/08, G06N3/045
Inventor: 潘晓光, 令狐彬, 董虎弟, 李娟, 陈智娇
Owner: 山西三友和智慧信息技术股份有限公司