Spam comment recognition method and system based on Bayesian algorithm and terminal

A technology relating to the Bayesian algorithm and spam comments, applied to Bayesian-algorithm-based spam comment identification methods, systems and terminals. It addresses the problem that the Bayesian algorithm alone cannot directly identify spam comments, thereby improving user experience and reducing interference.

Inactive Publication Date: 2015-09-23
GUANGDONG OPPO MOBILE TELECOMM CORP LTD
5 Cites, 17 Cited by

AI Technical Summary

Problems solved by technology

However, the Bayesian algorithm needs existing spam content as the basis for judging whether new comment content is normal, which raises a problem: if th

Examples

Embodiment 1

[0033] As shown in Figure 1, this embodiment discloses a method for identifying spam comments based on the Bayesian algorithm. The steps are as follows:

[0034] Select a certain amount of content that has been determined to be normal comments and add it to the training set of the Bayesian algorithm for training. In this step, the number of normal comments input into the training set is more than 100,000, generally 100,000 to 500,000.
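The training step can be illustrated with a minimal sketch. The text does not say how the Bayesian model scores a comment when only normal comments are available, so this sketch assumes a Laplace-smoothed unigram model of the normal class and reports an average per-token log-likelihood. The class name `NormalCommentModel`, the `alpha` parameter and whitespace tokenization are all hypothetical choices for illustration; Chinese comments would need a word segmenter instead.

```python
import math
from collections import Counter


class NormalCommentModel:
    """Unigram model fitted on normal comments only (illustrative assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha        # Laplace smoothing constant (assumed)
        self.counts = Counter()   # token -> occurrences in normal comments
        self.total = 0            # total number of training tokens

    def train(self, comments):
        for text in comments:
            tokens = text.lower().split()
            self.counts.update(tokens)
            self.total += len(tokens)

    def avg_log_likelihood(self, text):
        """Average log P(token | normal); lower means less like normal comments."""
        tokens = text.lower().split()
        if not tokens:
            return 0.0
        vocab = len(self.counts) + 1  # +1 reserves mass for unseen tokens
        denom = self.total + self.alpha * vocab
        log_probs = [math.log((self.counts[t] + self.alpha) / denom) for t in tokens]
        return sum(log_probs) / len(tokens)


model = NormalCommentModel()
model.train(["great video thanks for sharing", "thanks this video helped a lot"])
normal_score = model.avg_log_likelihood("thanks for the great video")
odd_score = model.avg_log_likelihood("cheap pills click link now")
```

With a realistic training set (the text suggests 100,000 or more normal comments), a cut-off on this score would have to be tuned on held-out data; no such cut-off is given in the source.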

[0035] Use the content of the new comment as a keyword to search in the original comment database through the search engine;
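As a rough illustration of this search step, the sketch below builds a tiny in-memory inverted index and returns every stored comment that shares at least one token with the new comment. The source does not name a search engine, so `CommentIndex`, `tokenize` and the whitespace tokenization are assumptions standing in for whatever engine the system actually uses.

```python
from collections import defaultdict


def tokenize(text):
    # Whitespace tokenization is an assumption; a real system would use
    # a language-appropriate tokenizer.
    return text.lower().split()


class CommentIndex:
    """Toy inverted index over the original comment database (hypothetical)."""

    def __init__(self):
        self.comments = []                # comment_id -> text
        self.postings = defaultdict(set)  # token -> set of comment ids

    def add(self, text):
        comment_id = len(self.comments)
        self.comments.append(text)
        for token in set(tokenize(text)):
            self.postings[token].add(comment_id)
        return comment_id

    def search(self, query):
        """Return ids of comments sharing at least one token with the query."""
        hits = set()
        for token in set(tokenize(query)):
            hits |= self.postings.get(token, set())
        return sorted(hits)


index = CommentIndex()
index.add("buy cheap watches at example shop")
index.add("cheap watches here")
index.add("nice video")
candidate_ids = index.search("cheap watches for sale")
```

The candidates returned here are only a coarse recall set; the similarity and quantity checks in the next step decide which of them count as near-duplicates.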

[0036] Detect the similarity between the comments found in the original comment database and the new comment, and the number of such similar comments [...] threshold, then the new comment and the comments whose similarity with the new comment reaches the preset first threshold are judged to be suspected spam comments; otherwise the new comment is judged to be a normal comment; wherein the preset f...
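The similarity and quantity check might look like the following sketch. The source text is truncated around the thresholds, so everything specific here is an assumption: Jaccard similarity over token sets as the similarity measure, `sim_threshold` playing the role of the preset first threshold, and `count_threshold` as a hypothetical limit on the number of similar comments.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity in [0, 1] (assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def flag_suspected(new_comment, candidates, sim_threshold=0.8, count_threshold=3):
    """Return (is_suspected, similar_comments).

    A new comment is flagged when at least `count_threshold` candidates
    reach `sim_threshold`; both default values are illustrative only.
    """
    similar = [c for c in candidates if jaccard(new_comment, c) >= sim_threshold]
    if len(similar) >= count_threshold:
        # The new comment and its near-duplicates are all suspected spam.
        return True, similar
    return False, []


new_comment = "buy cheap watches at example shop"
candidates = [
    "buy cheap watches at example shop",
    "Buy cheap watches at example shop",
    "buy cheap watches at example shop now",
    "nice video",
]
suspected, similar = flag_suspected(new_comment, candidates)
```

The intuition is that spam tends to be posted many times with small variations, so a burst of near-identical comments is suspicious even when no labelled spam exists.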

Embodiment 2

[0042] As shown in Figure 2, this embodiment also discloses a spam comment identification system based on the Bayesian algorithm for carrying out the above identification method. It includes:

[0043] The acquisition module is used to obtain a certain amount of content that is determined to be normal comments, and then input it into the training set of the Bayesian algorithm for training;

[0044] The original comment database, which stores all comment content and is the original database of the system;

[0045] The search engine module is used to use the content of the new comment as a keyword to search for the content of the comment in the original comment database;

[0046] The similarity detection module is used to detect the similarity between the comments in the original comment database searched by the search engine module and the new comments;

[0047] A quantity detection module is used to detect the quantity of comments whose similarity with the new comm...
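One way to picture how these modules fit together is the sketch below, which wires a search function, a similarity function and a Bayesian judgement into a single classifier. The module list above is truncated, so the final Bayesian judgement step is taken from the Abstract, and every name (`SpamCommentPipeline`, `search`, `similarity`, `is_normal`) and both threshold defaults are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SpamCommentPipeline:
    """Hypothetical wiring of the modules; not the patented implementation."""

    search: Callable[[str], List[str]]        # search engine module
    similarity: Callable[[str, str], float]   # similarity detection module
    is_normal: Callable[[str], bool]          # Bayesian judgement on suspected comments
    sim_threshold: float = 0.8                # assumed first threshold
    count_threshold: int = 3                  # assumed quantity threshold

    def classify(self, new_comment: str) -> str:
        candidates = self.search(new_comment)
        similar = [
            c for c in candidates
            if self.similarity(new_comment, c) >= self.sim_threshold
        ]
        # Quantity detection: too few near-duplicates means not suspected.
        if len(similar) < self.count_threshold:
            return "normal"
        # Suspected spam is passed to the Bayesian model for the final call.
        return "normal" if self.is_normal(new_comment) else "spam"


# Stub modules for demonstration only.
database = ["win a prize now"] * 3 + ["nice video"]
pipeline = SpamCommentPipeline(
    search=lambda q: database,
    similarity=lambda a, b: 1.0 if a == b else 0.0,
    is_normal=lambda text: "prize" not in text,
)
label_spam = pipeline.classify("win a prize now")
label_normal = pipeline.classify("nice video")
```

Passing the modules in as callables keeps the sketch independent of any particular search engine or classifier.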

Embodiment 3

[0052] This embodiment also discloses a terminal that includes the above-mentioned spam comment identification system based on the Bayesian algorithm. The terminal can be a mobile phone, a tablet computer or a computer.

Abstract

The invention discloses a spam comment recognition method and system based on the Bayesian algorithm, and a terminal. The method includes the following steps: a certain amount of content determined to be normal comments is input into the training set of the Bayesian algorithm for training; the content of a new comment is used as a keyword to search the original comment database through a search engine; the similarity between the comments found in the original comment database and the new comment, and the number of similar comments, are detected, and whether the new comment is a suspected spam comment is determined from the similarity and the number; the content of a new comment determined to be suspected spam is input into the Bayesian algorithm, which judges whether it is a normal comment. Because the search engine and the Bayesian algorithm are combined, spam comments can be mined and recognized among a large number of comments without previous spam comment content as a reference. This avoids the limitation that a pure Bayesian algorithm depends on previous spam comment content and cannot recognize new variants of spam comment content.

Description

Technical field

[0001] The invention relates to network security technology, in particular to a Bayesian-algorithm-based spam comment identification method, system and terminal.

Background

[0002] In recent years, with the rapid development of the Internet, the way people express their opinions and communicate with each other has changed. The Internet has become the main tool for acquiring knowledge, communicating and publishing information. With the development of interactive platforms such as e-commerce, mining the information in comments has attracted more and more attention. After people watch videos, read blogs or Weibo posts, or purchase goods on e-commerce platforms, they usually post comments to express their opinions. For example, video comments reflect viewers' feelings after watching a video, blog or Weibo comments reflect readers' views on the post and their feelings toward the publisher, and product re...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/951, G06F40/279, G06F40/30
Inventor: 周德海
Owner: GUANGDONG OPPO MOBILE TELECOMM CORP LTD