
Weakly supervised microblog multi-emotion dictionary expansion method based on semantics

A technology for emotion dictionaries and emotion words, applied in semantic analysis, special data processing applications, instruments, etc. It addresses problems such as time-consuming and laborious manual annotation, poorly targeted emotion dictionaries, and the inability to handle irregular text and rich semantic expression.

Active Publication Date: 2018-02-13
GOONIE INT SOFTWARE BEIJING
5 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

The quality of an emotion dictionary directly affects the final classification result, yet many emotion dictionaries are poorly targeted and contain too few emotion words to meet classification requirements.
Manually annotated sentiment dictionaries are not only time-consuming and laborious to build, but also cannot cope with the irregular text and rich semantic expression found in massive microblog data.



Examples


Embodiment Construction

[0049] The specific embodiments of the present invention will be further described in detail below in conjunction with the drawings and embodiments. The following examples are used to illustrate the present invention, but not to limit the scope of the present invention.

[0050] As shown in Figure 1, the method proposed by the present invention is implemented in the following steps (taking Sina Weibo as an example):

[0051] Step (1) Weibo corpus acquisition and preprocessing

[0052] Use the API provided by Sina Weibo to download the Weibo corpus in JSON format, and extract the text published by users to obtain the Weibo corpus, denoted G1.
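The extraction in [0052] might look roughly like the sketch below. It assumes the downloaded JSON is a top-level array of post objects and that the post body sits in a field named "text"; both the layout and the helper name `extract_texts` are illustrative assumptions, not details given in the patent text. The download itself is not shown.

```python
import json


def extract_texts(raw_json: str, text_field: str = "text") -> list[str]:
    """Return the non-empty post texts from a JSON array of post objects.

    The array layout and the `text_field` name are assumptions for
    illustration; adjust them to the actual API response.
    """
    posts = json.loads(raw_json)
    texts = []
    for post in posts:
        text = post.get(text_field, "")
        # Skip posts with a missing, non-string, or whitespace-only body.
        if isinstance(text, str) and text.strip():
            texts.append(text.strip())
    return texts


# Tiny hand-written sample standing in for a downloaded response.
sample = '[{"id": 1, "text": "今天很开心"}, {"id": 2, "text": "  "}, {"id": 3}]'
g1 = extract_texts(sample)  # plays the role of corpus G1
```

Posts without usable text are dropped here so that later segmentation only sees real content; whether the original method does the same is not stated.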

[0053] Traditional-to-simplified Chinese conversion is applied to corpus G1 to obtain corpus G2. The ICTCLAS word segmentation system is then used to perform word segmentation and part-of-speech tagging on G2, and the segmented corpus is filtered to keep only Chinese characters, part-of-s...
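The filtering step in [0053] can be sketched as follows. Because the paragraph is truncated, what is retained besides Chinese characters is not recoverable; this sketch simply keeps tokens made up entirely of Chinese characters, together with their part-of-speech tags. The input format (a list of word/tag pairs), the tag values, and the helper name `keep_chinese_tokens` are assumptions. Traditional-to-simplified conversion and the ICTCLAS segmentation itself are not shown.

```python
import re

# CJK Unified Ideographs basic block. Treating this range as "Chinese
# characters" is an assumption; extension blocks are not covered.
CHINESE_ONLY = re.compile(r"[\u4e00-\u9fff]+")


def keep_chinese_tokens(tagged):
    """Keep (word, pos_tag) pairs whose word consists only of Chinese characters."""
    return [(word, tag) for word, tag in tagged if CHINESE_ONLY.fullmatch(word)]


# Hypothetical segmenter output: (word, part-of-speech tag) pairs.
tagged = [
    ("今天", "t"),
    ("很", "d"),
    ("开心", "a"),
    ("!!!", "w"),
    ("http", "x"),
    ("2018", "m"),
]
filtered = keep_chinese_tokens(tagged)
```

Punctuation, Latin-script fragments such as URL pieces, and bare numbers are discarded, which is one plausible way to reduce the text irregularity the patent mentions.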


Abstract

The invention discloses a semantics-based, weakly supervised method for expanding a microblog multi-emotion dictionary. The method comprises the following steps: establishing a candidate seed dictionary; filtering candidate seed emotion words by word-frequency weighting and entropy weighting; acquiring candidate emotion words with the word2vec algorithm and verifying them by a statistical approach; and supplementing the emotion dictionary by a rule-based method. The method expands the multi-emotion dictionary effectively and solves the problem of imbalance in the number of emotion words across the multi-emotion dictionary.
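To make the seed-filtering step more concrete, here is a minimal sketch of one way frequency and entropy could be combined. The abstract does not give the actual weighting formulas, so everything here is an assumption: a word is kept if it occurs at least `min_freq` times and the Shannon entropy of its distribution over emotion classes is at most `max_entropy` (low entropy meaning the word is concentrated in one emotion). The emotion labels, counts, and thresholds are invented for illustration.

```python
import math


def emotion_entropy(counts: dict[str, int]) -> float:
    """Shannon entropy (in bits) of a word's distribution over emotion classes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def filter_seeds(word_counts, min_freq, max_entropy):
    """Keep frequent, low-entropy words and map each to its dominant emotion.

    Hypothetical hard-threshold rule; the patent's weighting scheme may differ.
    """
    kept = {}
    for word, counts in word_counts.items():
        total = sum(counts.values())
        if total >= min_freq and emotion_entropy(counts) <= max_entropy:
            kept[word] = max(counts, key=counts.get)
    return kept


# Invented per-emotion occurrence counts for three candidate words.
word_counts = {
    "开心": {"joy": 18, "anger": 1, "sadness": 1},   # frequent and concentrated
    "今天": {"joy": 10, "anger": 10, "sadness": 10},  # frequent but spread evenly
    "罕见": {"joy": 1, "anger": 0, "sadness": 0},     # too rare to trust
}
seeds = filter_seeds(word_counts, min_freq=5, max_entropy=1.0)
```

With these sample counts only "开心" survives: "今天" is spread evenly over the three classes, and "罕见" is below the frequency threshold. A weighted score in place of hard thresholds would be an equally reasonable reading of the abstract.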

Description

Technical field

[0001] The invention belongs to the field of text information processing, and specifically relates to a semantics-based, weakly supervised method for expanding a microblog multi-emotion dictionary.

Background technique

[0002] Weibo (microblogging) is a global platform for sharing user information. Users can share and disseminate information by publishing text or pictures. In recent years, Weibo websites have developed rapidly. Chinese microblogging is represented by Sina Weibo and Tencent Weibo, and English-language microblogging by Twitter and Facebook. The development of Weibo has accelerated the spread of information. However, as information becomes easier to obtain, the efficiency with which people extract knowledge from massive data is decreasing.

[0003] Traditional text classification can no longer meet people's requirements for classifying instant online information. How to automatically judge the emotions that people want to express based on the cont...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F16/374, G06F40/30
Inventor: 刘磊, 孙孟涛, 贾亚璐, 陈浩
Owner: GOONIE INT SOFTWARE BEIJING