
Event detection-oriented short text data filtering method for social networks

A social network event detection technology, applied in digital data processing, special data processing applications, natural language data processing, and related fields. It addresses the sparse features, high noise, and short length of social network text, and provides effective data input for event detection.

Status: Inactive
Publication Date: 2018-12-21
UNIV OF ELECTRONICS SCI & TECH OF CHINA
3 Cites, 8 Cited by

AI Technical Summary

Problems solved by technology

Compared with traditional long text, social network short text is characterized by large volume, high noise, a low signal-to-noise ratio, irregular expression, and short length. The traditional bag-of-words model, which relies on word frequency information, is therefore not applicable, and it leads to feature sparsity and the curse of dimensionality.
Existing work on classifying short text in social networks relies mainly on either semantic features or structural features. The former depends on a large corpus; the latter uses simple, narrow feature selection, and the selected features scale and port poorly. Neither has achieved good results.
The second type of method also ignores the social network environment in which the short text appears: it does not consider the background characteristics of the publisher, the statistical syntactic features of the text, or the text's subsequent influence on the social network, all of which could benefit short text classification.
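The sparsity problem described above can be illustrated with a small sketch. The four sample posts and the helper names `bag_of_words` and `sparsity` are made up for illustration and do not come from the patent; the point is only that very short texts share few words, so most entries of a word-count matrix are zero.

```python
def bag_of_words(docs):
    """Build a vocabulary and one word-count row per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0] * len(vocab)
        for w in d.lower().split():
            row[index[w]] += 1
        rows.append(row)
    return vocab, rows


def sparsity(rows):
    """Fraction of zero entries in the count matrix."""
    total = sum(len(r) for r in rows)
    zeros = sum(1 for r in rows for v in r if v == 0)
    return zeros / total


# Hypothetical short posts; none of them share a word.
posts = [
    "omg traffic sooo bad today",
    "earthquake felt downtown just now",
    "lol new phone who dis",
    "power outage near the station",
]

vocab, rows = bag_of_words(posts)
print(len(vocab), sparsity(rows))  # 20 words, 0.75 of the entries are zero
```

With only four posts, three quarters of the matrix is already empty, and the vocabulary (the feature dimension) grows with every new post.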

Method used




Embodiment Construction

[0028] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0029] Figure 1 is a schematic flow chart of the event detection-oriented social network short text data filtering method of the present invention. The method comprises the following steps:

[0030] A. Obtain social network short text data and preprocess it;

[0031] B. Extract user background features, text syntax features and text influence features from the social network short text data processed in step A;

[0032] C. Train the GBDT classifier to classify the s...
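Steps A and B can be sketched as follows. This is a minimal illustration, not the patent's implementation: the text available here does not list the individual features or the preprocessing rules, so the post fields (`followers`, `friends`, `verified`, `retweets`, `likes`, `replies`), the regular expressions, and the 11 features below are all assumptions chosen to show the three feature groups.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def preprocess(text):
    """Step A (assumed rules): drop URLs, collapse whitespace, lowercase."""
    text = URL_RE.sub(" ", text)
    return " ".join(text.split()).lower()


def extract_features(post):
    """Step B (assumed features): one flat vector from three feature groups."""
    raw = post["text"]
    tokens = preprocess(raw).split()

    # User background features (hypothetical fields).
    user = [post["followers"], post["friends"], 1 if post["verified"] else 0]

    # Text syntax features: simple counts over the raw and cleaned text.
    syntax = [
        len(tokens),
        len(URL_RE.findall(raw)),
        len(MENTION_RE.findall(raw)),
        len(HASHTAG_RE.findall(raw)),
        raw.count("!") + raw.count("?"),
    ]

    # Text influence features (hypothetical fields).
    influence = [post["retweets"], post["likes"], post["replies"]]

    return user + syntax + influence


# A made-up post record for illustration.
post = {
    "text": "Breaking: flood warning for #Chengdu https://example.com/a @cityalert",
    "followers": 1200, "friends": 300, "verified": True,
    "retweets": 15, "likes": 40, "replies": 2,
}

features = extract_features(post)
print(features)
```

The resulting vector is what a classifier in step C would consume; the patent's abstract mentions 20 dimensions, whereas this sketch produces 11.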



Abstract

The invention discloses an event detection-oriented social network short text data filtering method comprising the following steps: preprocessing the social network short text data, extracting features from it, and training a GBDT classifier to classify it. The disclosed method analyzes the background characteristics of the user, the syntactic features of the text, and the influence features of the text, and extracts 20-dimensional classification features from them. Finally, the GBDT algorithm classifies the short text data; data classified as useless information is filtered out and potentially valuable data is retained, providing effective data input for event detection.
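The classify-then-filter step can be sketched with a tiny pure-Python gradient-boosted model. This is a simplified stand-in for a full GBDT, not the patent's classifier: it boosts depth-1 regression trees (stumps) on the log-loss gradient and uses mean residuals as leaf values. The three-column feature vectors (text length, URL count, retweets), the labels, and all function names are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, r):
    """Least-squares regression stump on residuals r: (feature, threshold, left, right)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r[i] for i, row in enumerate(X) if row[j] <= t]
            right = [r[i] for i, row in enumerate(X) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:  # no feature has two distinct values
        mean = sum(r) / len(r)
        return (0, 0.0, mean, mean)
    return best[1:]


def stump_predict(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm


def train_gbdt(X, y, n_rounds=20, lr=0.5):
    """Boost stumps on the negative gradient of log loss (y - p)."""
    p = sum(y) / len(y)
    f0 = math.log(p / (1 - p))  # initial log-odds; needs both classes present
    F = [f0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        s = fit_stump(X, residuals)
        stumps.append(s)
        F = [fi + lr * stump_predict(s, row) for fi, row in zip(F, X)]
    return f0, lr, stumps


def predict_proba(model, row):
    f0, lr, stumps = model
    return sigmoid(f0 + lr * sum(stump_predict(s, row) for s in stumps))


def filter_posts(model, posts, features, threshold=0.5):
    """Keep posts whose predicted probability of being useful meets the threshold."""
    return [p for p, x in zip(posts, features) if predict_proba(model, x) >= threshold]


# Toy training set: [text length, URL count, retweets]; 1 = useful, 0 = useless.
X = [[12, 0, 0], [80, 1, 40], [15, 0, 1], [95, 1, 120], [20, 0, 0], [70, 0, 35]]
y = [0, 1, 0, 1, 0, 1]
model = train_gbdt(X, y)

new_posts = ["flood warning issued for the river district", "lol"]
new_features = [[85, 1, 60], [10, 0, 0]]
kept = filter_posts(model, new_posts, new_features)
print(kept)
```

In practice a library GBDT with deeper trees would replace the hand-written stumps; the sketch only shows how boosted predictions feed the filtering decision.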

Description

Technical field

[0001] The invention belongs to the technical field of natural language processing, and in particular relates to an event detection-oriented social network short text data filtering method.

Background art

[0002] With the development of technologies such as Web 2.0, social networks, and the mobile Internet, the explosive growth of information is becoming increasingly evident, and traditional ways of exchanging information have been greatly disrupted. On current mainstream social media platforms such as Twitter and Facebook, users can discuss topics of interest and share real-time news anytime and anywhere. Because of the huge number of social media users, the simplicity of publishing, and the speed of dissemination, social networks contain a wealth of information. However, while social networks bring abundant information, the explosive information also makes it difficult to effecti...

Claims


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F40/211, G06F40/289
Inventor: 费高雷, 赵越, 于娟娟
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA