Method and system of personalized junk information filtering in public-sentiment information

A junk-information filtering technology, applied in special data processing applications, instruments, electronic digital data processing, etc. It addresses three problems: users find it difficult to customize a personalized junk-filtering mechanism, the changing characteristics of spam cannot be captured quickly, and models update slowly. It achieves real-time processing, improves filtering performance, and lets filtering results be corrected flexibly.

Status: Inactive; Publication Date: 2018-11-23
INST OF INFORMATION ENG CHINESE ACAD OF SCI
6 Cites, 1 Cited by
AI Technical Summary

Problems solved by technology

[0003] Existing technology filters junk information poorly when processing large-scale public-opinion information: processing takes a long time, models update slowly, the changing characteristics of spam cannot be captured quickly, and users find it difficult to customize a personalized junk-filtering mechanism.

Examples

Embodiment Construction

[0023] In order to make the above-mentioned features and advantages of the present invention more comprehensible, the following specific embodiments are described in detail in conjunction with the accompanying drawings.

[0024] This embodiment provides a method for personalized junk-information filtering in public-opinion information and a system implementing the method. As shown in Figure 1, the method is divided into the following five main stages:

[0025] 1. Data preparation: preprocessing the resources the system requires, including the following steps:

[0026] Step 1: Add the general thesaurus and the user-personalized thesaurus. The general thesaurus is drawn from open Internet data and from initial settings made by system personnel; it is visible to all users and takes effect in the prediction stage. A user-personalized thesaurus is set by a specific user and is visible and effective only for that user. These thesauruses i...
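The visibility rule in Step 1 (general terms apply to everyone, personalized terms only to their owner) can be illustrated with a minimal sketch. The class name `ThesaurusStore`, its methods, the user ids, and the sample terms are all hypothetical; the source text does not describe how the thesauruses are stored.

```python
from collections import defaultdict


class ThesaurusStore:
    """Hypothetical container for a general thesaurus plus per-user thesauri."""

    def __init__(self, general_terms=()):
        # General terms are visible to every user.
        self.general = set(general_terms)
        # Personalized terms are keyed by user id and visible only to that user.
        self.personal = defaultdict(set)

    def add_general(self, term):
        self.general.add(term)

    def add_personal(self, user_id, term):
        self.personal[user_id].add(term)

    def terms_for(self, user_id):
        """Return the terms in effect for one user: general plus that user's own."""
        return self.general | self.personal.get(user_id, set())


# Illustrative data only (assumed, not from the patent).
store = ThesaurusStore(general_terms=["lottery", "free prize"])
store.add_personal("alice", "crypto airdrop")

print(sorted(store.terms_for("alice")))  # general terms plus alice's own term
print(sorted(store.terms_for("bob")))    # general terms only
```

Keeping the two sets separate and merging them per request means one user's additions never leak into another user's filtering.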

Abstract

The invention provides a method and a system of personalized junk information filtering in public-sentiment information. The steps of the method include: constructing an in-memory index library on the basis of a general word library and a user-personalized word library; carrying out word segmentation on an original document containing the public-sentiment information, and removing stop words; checking the processed document against the in-memory index library to separate junk information from non-junk information; inputting that non-junk information into an updateable information classification model to further separate junk information from non-junk information; and labeling the non-junk information identified by the information classification model as junk or non-junk on the basis of a general junk identification labeling corpus and a user-personalized junk identification labeling corpus to generate a training set, which is then used to update the information classification model.
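The two-stage flow in the abstract (index lookup first, classifier second, then updating the classifier from labeled examples) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the excerpt does not name the classifier, so a small incrementally updated naive Bayes model is swapped in; whitespace splitting stands in for real word segmentation; and every name here (`tokenize`, `UpdatableNB`, `filter_document`, the stop-word list, the sample texts) is hypothetical.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "to", "of", "and", "is"}  # illustrative only


def tokenize(text):
    """Whitespace split as a stand-in for word segmentation; drops stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


class UpdatableNB:
    """Tiny multinomial naive Bayes whose counts can be updated incrementally."""

    def __init__(self):
        self.word_counts = {"junk": Counter(), "ok": Counter()}
        self.doc_counts = Counter()

    def update(self, tokens, label):
        # Adding one labeled document only increments counts; no full retrain.
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1

    def predict(self, tokens):
        vocab = set(self.word_counts["junk"]) | set(self.word_counts["ok"])
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = "ok", -math.inf
        for label in ("ok", "junk"):
            # Laplace-smoothed log prior and log likelihoods.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            denom = sum(self.word_counts[label].values()) + len(vocab) + 1
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def filter_document(text, junk_index, model):
    tokens = tokenize(text)
    # Stage 1: a hit in the in-memory index marks the document as junk.
    if any(tok in junk_index for tok in tokens):
        return "junk"
    # Stage 2: everything else goes to the updateable classifier.
    return model.predict(tokens)


# Illustrative data only (assumed, not from the patent).
junk_index = {"lottery", "casino"}
model = UpdatableNB()
model.update(tokenize("click here to claim your bonus now"), "junk")
model.update(tokenize("city council publishes flood response plan"), "ok")

print(filter_document("win the lottery today", junk_index, model))  # junk (index hit)
print(filter_document("claim your bonus now", junk_index, model))   # junk (classifier)
print(filter_document("council flood plan", junk_index, model))     # ok
```

The set lookup in stage 1 is cheap and reflects thesaurus edits immediately, while the count-based model in stage 2 absorbs newly labeled documents through `update` without retraining from scratch.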

Description

Technical field

[0001] The invention relates to the technical field of network information processing, in particular to a personalized junk-filtering method and system for Internet public-opinion information.

Background

[0002] Internet public-opinion monitoring involves massive amounts of data, and spam filtering plays an important role in it. First, spam filtering helps retain valid information and remove invalid information; second, it reduces the retrieval load on the system and the size of the data.

[0003] Existing technology filters junk information poorly when processing large-scale public-opinion information: processing takes a long time, models update slowly, the changing characteristics of spam cannot be captured quickly, and users find it difficult to customize a personalized junk-filtering mechanism.

Contents of the invention

[0004...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27, G06K9/62
CPC: G06F40/289, G06F18/24155
Inventors: 齐保元, 李鹏, 王斌, 周美林
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI