Method and device for identifying spam

A spam identification technology, applied in the field of information processing, that addresses inaccurate identification results and heavy human resource consumption. Its effects are improved timeliness and accuracy of identification, a better ability to capture variant spam, and lower human cost.

Active Publication Date: 2020-10-16
阿里巴巴(中国)网络技术有限公司

AI Technical Summary

Problems solved by technology

[0012] The embodiments of the present application provide a spam identification method and device that address the inaccurate identification results and heavy human resource consumption of existing spam identification technologies.

Examples

Embodiment 1

[0034] Embodiment 1 of the present application provides a spam identification method. Figure 1 is a flow chart of the steps of the method, which may include the following steps:

[0035] Step 101: Determine the training sample set, the information category to which each training sample in the training sample set belongs, and the basic feature data of each training sample.

[0036] It should be noted that in machine learning the composition of the training sample set is very important: the distribution of positive and negative samples should be as close as possible to the data distribution of the real environment, so that the recognition model is more robust and more accurate in the real environment. Therefore, in the training sample set, the ratio of the number of spam training samples to the number of non-spam training samples can usually be within a set r...
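The ratio constraint in [0036] can be illustrated with a minimal sketch. The bounds `low` and `high`, the function name `rebalance`, and the choice of random downsampling are all assumptions made for illustration; the source text is truncated before it states the actual range or how the ratio is enforced.

```python
import random

def rebalance(spam, ham, low, high, seed=0):
    """Downsample the larger class so that len(spam) / len(ham) moves into
    [low, high]. The bounds are hypothetical; the patent text does not give them.
    Integer rounding means a very narrow range may not be hit exactly."""
    rng = random.Random(seed)
    spam, ham = list(spam), list(ham)
    if not spam or not ham:
        raise ValueError("both classes need at least one sample")
    ratio = len(spam) / len(ham)
    if ratio > high:
        # Too much spam: keep at most high * len(ham) spam samples.
        keep = max(1, int(high * len(ham)))
        spam = rng.sample(spam, keep)
    elif ratio < low:
        # Too little spam: shrink the non-spam class until the ratio reaches low.
        keep = max(1, int(len(spam) / low))
        ham = rng.sample(ham, keep)
    return spam, ham

# Usage: 10 spam and 1000 non-spam samples, target ratio between 0.25 and 1.0.
spam_samples = [f"spam-{i}" for i in range(10)]
ham_samples = [f"ham-{i}" for i in range(1000)]
spam_out, ham_out = rebalance(spam_samples, ham_samples, low=0.25, high=1.0)
```

Downsampling discards data, so a real system might instead collect more samples of the rarer class; the sketch only shows how a ratio check could be applied.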

Embodiment 2

[0108] Based on the same inventive concept, Embodiment 2 of the present application provides an information identification device. Figure 3 is a schematic structural diagram of the device, which may include:

[0109] A sample determination unit 301, configured to determine the training sample set, the information category to which each training sample in the training sample set belongs, and the basic feature data of each training sample;

[0110] A model learning unit 302, configured to train an information recognition model for identifying spam according to the information category to which each training sample belongs and the basic feature data of each training sample;

[0111] The spam identification unit 303 is configured to classify each piece of information to be identified based on the obtained information identification model, and determine whether each piece of information to b...
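The division of work among units 301, 302, and 303 can be sketched as three functions. This is only an illustration under stated assumptions: the excerpt does not say which features or which classifier the device uses, so word tokens stand in for the "basic feature data" and a multinomial naive Bayes classifier with add-one smoothing stands in for the information recognition model. All function names are hypothetical.

```python
import math
from collections import Counter

def extract_features(text):
    """Hypothetical 'basic feature data': lowercase word tokens."""
    return text.lower().split()

def determine_samples(labeled_texts):
    """Unit 301: pair each sample's features with its information category."""
    return [(extract_features(text), label) for text, label in labeled_texts]

def train_model(samples):
    """Unit 302: fit a multinomial naive Bayes model (a stand-in classifier)."""
    class_counts = Counter()
    token_counts = {}
    vocab = set()
    for tokens, label in samples:
        class_counts[label] += 1
        token_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return {"classes": class_counts, "tokens": token_counts, "vocab": vocab}

def identify(model, text):
    """Unit 303: return the most probable category for one piece of information."""
    total = sum(model["classes"].values())
    vocab_size = len(model["vocab"])
    tokens = extract_features(text)
    best_label, best_score = None, -math.inf
    for label, n in model["classes"].items():
        counts = model["tokens"][label]
        denom = sum(counts.values()) + vocab_size  # add-one smoothing
        score = math.log(n / total)
        for tok in tokens:
            score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Usage with a tiny, made-up training set.
training = [
    ("buy cheap watches now", "spam"),
    ("cheap pills buy now", "spam"),
    ("great product fast delivery", "ham"),
    ("delivery was fast thanks", "ham"),
]
model = train_model(determine_samples(training))
label_a = identify(model, "buy cheap now")
label_b = identify(model, "fast delivery")
```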

Abstract

The invention discloses a spam identification method and device. Spam that has been identified historically, and spam related to the spam reported in the most recent time period, are automatically added to the training sample set, so that the trained information identification model is continuously updated and a complete information identification system based on a closed-loop data stream is formed. On the one hand, this improves the model's ability to capture variant spam; on the other hand, it prevents the model's ability to identify older forms of spam from degrading. The result is more timely and more accurate information identification at a lower human cost.
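The closed-loop update described in the abstract can be sketched as follows. This is a toy illustration, not the patented implementation: the keyword-set "model", the function names, and the simplification that reported spam is added directly (rather than spam "related to" reported spam) are all assumptions.

```python
def update_training_set(training_set, identified_spam, reported_spam):
    """Append newly identified and newly reported spam as 'spam' samples,
    skipping texts that are already in the training set."""
    seen = {text for text, _ in training_set}
    updated = list(training_set)
    for text in list(identified_spam) + list(reported_spam):
        if text not in seen:
            seen.add(text)
            updated.append((text, "spam"))
    return updated

def train_keyword_model(training_set):
    """Toy stand-in model: the set of words that occur only in spam samples."""
    spam_words, ham_words = set(), set()
    for text, label in training_set:
        target = spam_words if label == "spam" else ham_words
        target.update(text.lower().split())
    return spam_words - ham_words

def is_spam(model, text):
    return any(word in model for word in text.lower().split())

def closed_loop(training_set, incoming_batches, reports_by_period):
    """For each period: classify the batch, feed identified and reported spam
    back into the training set, then retrain the model."""
    model = train_keyword_model(training_set)
    for batch, reports in zip(incoming_batches, reports_by_period):
        identified = [text for text in batch if is_spam(model, text)]
        training_set = update_training_set(training_set, identified, reports)
        model = train_keyword_model(training_set)
    return training_set, model

# Usage: a variant spelling ("w4tches") is reported in period 1 and is then
# recognized automatically in period 2.
seed_set = [("cheap watches", "spam"), ("nice shoes", "ham")]
batches = [["cheap bags", "nice hat"], ["w4tches sale", "nice day"]]
reports = [["w4tches discount"], []]
final_set, final_model = closed_loop(seed_set, batches, reports)
```

Because old samples are never dropped from the training set in this sketch, the retrained model keeps recognizing the original wording ("cheap", "watches") while also picking up the reported variant.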

Description

Technical field

[0001] The present application relates to the technical field of information processing, and in particular to a spam identification method and device.

Background

[0002] With the promotion and popularization of the network, the amount of network information keeps increasing. A large volume of network information inevitably contains some illegal information (also called junk information or spam), which not only wastes network resources but also pollutes the network environment and causes users considerable trouble.

[0003] For example, in the field of e-commerce, more and more unscrupulous users post advertising reviews (that is, spam) to promote "three-no" products, or to speculate and cheat so as to mislead consumers. This behavior not only pollutes the review system but may even seriously damage the interests of consumers. Therefore, in order to ensure the health and legality of network information, the automatic identif...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F16/38, G06F16/36
CPC: G06F16/35, G06F16/36, G06F16/38
Inventor: 肖谦赵争超林君潘林林张一昌
Owner: 阿里巴巴(中国)网络技术有限公司