Junk comment identification method based on collaborative training

A collaborative training technique for spam comment identification. It addresses the prevalence of spam comments and has the effects of reducing labeling workload, learning models efficiently, and improving accuracy.

Status: Inactive; Publication Date: 2017-06-13

GUANGXI NORMAL UNIV

Cites: 8; Cited by: 16

Problems solved by technology

[0004] The technical problem addressed by the present invention is the large number of spam comments in existing social networks. To solve it, the invention provides a method for identifying spam comments based on collaborative training.

Embodiment Construction

[0025] The present invention is described in further detail below, taking spam comments in microblogs as an example.

[0026] The overall framework of the spam comment identification method based on collaborative training is shown in Figure 1.

[0027] Because microblog posts and their comments are limited to 140 characters, the text content is short, yet the volume of comment data is huge and new network slang emerges constantly. The invention designs a microblog spam comment identification method that uses the Co-Training collaborative training algorithm with two classifiers, AdaBoost and SVM. Both classifiers are first trained on the labeled training data (10%), a large amount of unlabeled data (70%) is then used as an additional set for collaborative training of the classifiers, and finally labeled data (20%) is used as the test set. This improves classification accuracy while saving a large amount of sample labeling work.
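The co-training loop in [0027] can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The source names AdaBoost and SVM but does not show how confident samples are selected, so the rule used here (each learner contributes its `per_round` most confident pseudo-labels to a shared training pool) is an assumption. To keep the example free of dependencies, two simple stand-in learners, `CentroidClassifier` and `StumpClassifier`, take the place of AdaBoost and SVM. All names, the synthetic two-feature data, and the `rounds` and `per_round` parameters are hypothetical.

```python
import random


class CentroidClassifier:
    """Stand-in learner: predicts the class whose mean feature vector is nearest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_with_confidence(self, x):
        dist = {c: sum((a - b) ** 2 for a, b in zip(x, m)) ** 0.5
                for c, m in self.centroids.items()}
        ranked = sorted(dist, key=dist.get)
        # Confidence is the gap between the nearest and second-nearest centroid.
        margin = dist[ranked[1]] - dist[ranked[0]] if len(ranked) > 1 else 0.0
        return ranked[0], margin


class StumpClassifier:
    """Stand-in learner: a one-feature threshold rule for labels 0 and 1."""

    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            values = sorted({x[j] for x in X})
            thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
            for thr in thresholds:
                for hi_label in (0, 1):
                    errors = sum((hi_label if x[j] >= thr else 1 - hi_label) != t
                                 for x, t in zip(X, y))
                    if best is None or errors < best[0]:
                        best = (errors, j, thr, hi_label)
        _, self.j, self.thr, self.hi_label = best
        return self

    def predict_with_confidence(self, x):
        label = self.hi_label if x[self.j] >= self.thr else 1 - self.hi_label
        # Confidence is the distance from the decision threshold.
        return label, abs(x[self.j] - self.thr)


def co_train(clf_a, clf_b, X_lab, y_lab, X_unlab, rounds=5, per_round=2):
    """Grow the labeled pool with each learner's most confident pseudo-labels."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(rounds):
        if not pool:
            break
        clf_a.fit(X_train, y_train)
        clf_b.fit(X_train, y_train)
        picked = {}  # pool index -> pseudo-label
        for clf in (clf_a, clf_b):
            scored = [(clf.predict_with_confidence(x), i) for i, x in enumerate(pool)]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            for (label, _), i in scored[:per_round]:
                picked.setdefault(i, label)  # if both pick a sample, clf_a's label wins
        for i in sorted(picked, reverse=True):  # pop from the end so indices stay valid
            X_train.append(pool.pop(i))
            y_train.append(picked[i])
    clf_a.fit(X_train, y_train)
    clf_b.fit(X_train, y_train)
    return clf_a, clf_b, len(y_train)


def make_split(seed=0):
    """Synthetic data split 10% labeled / 70% unlabeled / 20% test (0 = normal, 1 = spam)."""
    rng = random.Random(seed)
    labeled, unlabeled, test = [], [], []
    for label, center in ((0, 0.0), (1, 3.0)):
        points = [[rng.gauss(center, 0.3), rng.gauss(center, 0.3)] for _ in range(50)]
        labeled += [(p, label) for p in points[:5]]
        unlabeled += points[5:40]
        test += [(p, label) for p in points[40:]]
    return labeled, unlabeled, test


def accuracy(clf, test):
    return sum(clf.predict_with_confidence(x)[0] == t for x, t in test) / len(test)


labeled, unlabeled, test = make_split()
clf_a, clf_b, n_train = co_train(
    CentroidClassifier(), StumpClassifier(),
    [x for x, _ in labeled], [t for _, t in labeled], unlabeled,
)
print(n_train, accuracy(clf_a, test), accuracy(clf_b, test))
```

Any pair of learners exposing `fit` and `predict_with_confidence` can be passed to `co_train`, so wrappers around real AdaBoost and SVM implementations could replace the stand-ins without changing the loop.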

[0028] (1) Experimental data acquisit...

Abstract

The invention discloses a junk comment identification method based on collaborative training. Junk comments are divided into explicit junk comments and implicit junk comments. Explicit junk comments are screened out with a rule-based method. Implicit junk comments are handled with an automatic identification method in which two classifiers, AdaBoost and SVM, are trained, and Co-Training is then used to further judge whether a comment is a junk comment. The method thereby improves classification accuracy while preserving the classification efficiency of the junk comment classification method.
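The two-stage flow in the abstract, a rule-based screen followed by a learned classifier, might look like the sketch below. It is illustrative only: the source does not list its rules, so the three patterns in `EXPLICIT_RULES` are hypothetical, as are the names `is_explicit_junk` and `identify`. The `implicit_classifier` argument is assumed to be any callable that returns a truthy value for junk, for example a wrapper around a co-trained model.

```python
import re

# Hypothetical patterns; the source does not state its actual rules.
EXPLICIT_RULES = [
    re.compile(r"https?://\S+", re.IGNORECASE),  # embedded link
    re.compile(r"\d{7,}"),                       # long digit run, e.g. a contact number
    re.compile(r"(.)\1{5,}"),                    # one character repeated six or more times
]


def is_explicit_junk(comment):
    """Stage 1: cheap rule-based screen for explicit junk comments."""
    return any(rule.search(comment) for rule in EXPLICIT_RULES)


def identify(comment, implicit_classifier):
    """Return True if the comment is judged to be junk."""
    if is_explicit_junk(comment):
        return True
    # Stage 2: only comments that pass the rules reach the learned classifier.
    return bool(implicit_classifier(comment))
```

Running the rules first means the classifier is only invoked on comments that are not obviously junk, which is one plausible reading of how the method keeps classification efficient.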

Description

Technical field

[0001] The invention relates to the technical field of machine learning, in particular to a method for identifying spam comments based on collaborative training.

Background technique

[0002] Machine learning (ML) is an interdisciplinary subject spanning multiple fields. It studies how computers simulate or implement human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. Data mining is one of the theoretical foundations for machine learning. Data mining refers to the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large amounts of incomplete, noisy, fuzzy, and random real-world data. Review-oriented data mining has long attracted the attention of researchers.

[0003] A social network is a social relationship network service built on a network platf...

Application Information

IPC(8): G06F17/27, G06F17/30

CPC: G06F16/35, G06F40/289

Inventor: 李志欣, 兰丹媚, 张灿龙

Owner: GUANGXI NORMAL UNIV
