Method for extracting important time slices in social media short texts

The method relates to time segments and social media technology and applies to text database querying, unstructured text data retrieval, and text database clustering/classification. It addresses limitations of existing approaches: poor fit to short texts, high cost in manpower and financial resources, and a narrow scope of application.

Active Publication Date: 2021-01-05
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0004] However, each of the three methods above has limitations.
The first method may work well on traditional long-form news articles, but short texts are often too short to contain similar keywords.
Moreover, short texts on social media can be published by anyone; authors generally pay little attention to wording, and the texts may contain dialect or even language errors, so the scope of application is clearly very small. For the second method, short texts appear gradually as an event develops. For example, when a hot topic emerges on Twitter, it attracts little attention at first and there are few related tweets, but as the event develops the number of tweets gradually increases.
Each tweet reflects information at only one point in time and usually discusses only one topic within the event, so no single tweet shows the whole picture of the event.
Therefore, dividing by paragraph is not feasible. For the third method, although each text is short, the number of short texts is often large. Crowdsourcing therefore consumes a great deal of manpower and financial resources and is not suitable for wide adoption.



Examples


Embodiment Construction

[0058] The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The specific embodiments described serve only to explain the present invention and are not intended to limit it.

[0059] The present invention proposes a method for extracting important time segments from social media short texts (for example, posts of up to 140 characters on Weibo or up to 280 characters on Twitter). The method mainly includes the following steps:

[0060] Step 1. Divide the social media short texts in time, using every two hours as one time period;
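The time division in Step 1 can be sketched as follows. This is only an illustrative sketch: the function name `split_into_periods`, the `(timestamp, text)` input format, and the choice to start counting periods from the earliest post are assumptions, not details given in the patent text.

```python
from collections import defaultdict
from datetime import datetime, timedelta

PERIOD = timedelta(hours=2)  # period length stated in Step 1


def split_into_periods(posts, start=None):
    """Group (timestamp, text) pairs into consecutive two-hour periods.

    Returns a dict mapping a period index (0, 1, 2, ...) to the list of
    texts published in that period. `start` defaults to the earliest
    timestamp (an assumption; the source does not say where periods begin).
    """
    if not posts:
        return {}
    if start is None:
        start = min(ts for ts, _ in posts)
    periods = defaultdict(list)
    for ts, text in posts:
        index = (ts - start) // PERIOD  # timedelta // timedelta gives an int
        periods[index].append(text)
    return dict(periods)


# Hypothetical sample data for illustration only.
posts = [
    (datetime(2020, 1, 1, 8, 5), "first report"),
    (datetime(2020, 1, 1, 9, 50), "more details"),
    (datetime(2020, 1, 1, 10, 5), "reactions"),
    (datetime(2020, 1, 1, 13, 30), "follow-up"),
]
periods = split_into_periods(posts)
```

With this sample, the first two posts fall into period 0, the third into period 1, and the last into period 2. Aligning periods to clock boundaries instead (for example 08:00, 10:00, ...) would only require passing a different `start`.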

[0061] Step 2. For the texts divided in Step 1, determine the important time segments from the perspective of topic evolution: extract the keyword sequences of the social media short texts with a dynamic topic model, taking the top 20 keywords of each topic; then look for the monotonic intervals where t...
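To make the idea of monotonic intervals concrete, the sketch below splits the rank series of a single keyword (one rank per time period) into maximal monotonic runs. It assumes the per-period ranks have already been obtained from a dynamic topic model; the function name `monotonic_intervals`, the handling of flat stretches as their own runs, and the sample ranks are illustrative assumptions rather than the patent's definition.

```python
def monotonic_intervals(ranks):
    """Split a rank series into maximal monotonic runs.

    Returns a list of (start, end, trend) tuples with inclusive period
    indices, where trend is +1 (rank number rising), -1 (falling) or
    0 (unchanged). Adjacent runs share their turning point.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    intervals = []
    if len(ranks) < 2:
        return intervals
    start = 0
    trend = sign(ranks[1] - ranks[0])
    for i in range(1, len(ranks) - 1):
        step = sign(ranks[i + 1] - ranks[i])
        if step != trend:
            # The direction changes at period i: close the current run.
            intervals.append((start, i, trend))
            start, trend = i, step
    intervals.append((start, len(ranks) - 1, trend))
    return intervals


# Hypothetical rank of one keyword over seven two-hour periods.
ranks = [5, 3, 2, 2, 4, 6, 1]
runs = monotonic_intervals(ranks)
# runs == [(0, 2, -1), (2, 3, 0), (3, 5, 1), (5, 6, -1)]
```

A later merging pass over these runs would need the amplitude criteria from the full patent text, which are not reproduced in this excerpt.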



Abstract

The invention discloses a method for extracting important time slices from social media short texts. The method comprises the following steps: dividing the texts into time periods; extracting the sequence of topic keywords in the social media short texts with a dynamic topic model, finding the monotonic intervals in the popularity ranking of each topic keyword, and merging monotonic intervals that have opposite trends and amount to fluctuation, or that have the same trend and a small change amplitude; taking the intersections of the merged monotonic interval sequences of all topic keywords in turn, calculating the chaos degree of each intersection, and ranking the intersections to obtain several important time slices determined from the perspective of topic evolution; performing sentiment analysis on the texts of each time period with a naive Bayes classifier, and determining the union of the important time slices of each sentiment from the sentiment change amplitude and a threshold; calculating the chaos degree within the union and ranking to obtain several important time slices determined from the perspective of sentiment change; and taking the intersection of the important time slices determined from the two perspectives to obtain the resulting important time slices.
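The sentiment branch of the abstract can be illustrated with a small sketch. It trains a multinomial naive Bayes classifier, labels the texts of each time period, and flags periods where the share of one sentiment changes by more than a threshold. Everything specific here is an assumption: the class `NaiveBayes`, the function `sentiment_shift_periods`, whitespace tokenization (which would not suit Chinese Weibo text), the reading of "sentiment change amplitude" as the absolute change in label share between consecutive periods, and the toy training data.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over whitespace tokens."""

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.labels}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.labels:
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def sentiment_shift_periods(periods, classifier, label, threshold):
    """Return period indices whose share of `label` differs from the
    previous period's share by more than `threshold`."""
    shares = {}
    for index in sorted(periods):
        texts = periods[index]
        hits = sum(classifier.predict(t) == label for t in texts)
        shares[index] = hits / len(texts) if texts else 0.0
    ordered = sorted(shares)
    return [cur for prev, cur in zip(ordered, ordered[1:])
            if abs(shares[cur] - shares[prev]) > threshold]


# Toy training set and per-period texts, for illustration only.
clf = NaiveBayes().fit(
    ["great happy win", "love this great day",
     "terrible sad loss", "hate this awful day"],
    ["pos", "pos", "neg", "neg"],
)
periods = {
    0: ["great win", "happy day"],
    1: ["sad loss", "awful"],
    2: ["terrible loss", "hate this"],
}
shifted = sentiment_shift_periods(periods, clf, "pos", threshold=0.5)
```

In this toy run the share of positive texts drops from period 0 to period 1 and then stays flat, so only period 1 is flagged. The chaos degree calculation and the final intersection with the topic-evolution slices are not sketched because this excerpt does not define them.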

Description

Technical field

[0001] The invention relates to a method for determining important time periods in social media short texts. The method can be applied to short texts from social media such as Weibo and Twitter, and addresses a key research problem in the field of short-text storytelling.

Background

[0002] Identifying important segments in the development of an event is a necessary prerequisite for advancing research on short-text storytelling, and it is also a major challenge. It is of great significance to the development of text visualization, which is an important area of data visualization. Short text is a form of text distinct from traditional long-form news. It has typical characteristics such as colloquial language, short length, and complex "social relations", and it generally appears on social media such as Twitter and Weibo. "Social relations" here refers to interactive actions such as forwarding, likes...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/33, G06F16/35, G06F16/9536, G06K9/62
CPC: G06F16/3329, G06F16/3344, G06F16/35, G06F16/9536, G06F18/24155
Inventor: 席德伟, 张怡
Owner: TIANJIN UNIV