Social network site false fan detection method achieved on basis of network crawler by means of machine learning

Web crawler and machine learning technology, applied in the field of data processing, addresses problems such as complex calculation, poor accuracy, and slow processing speed, and achieves high detection accuracy and low susceptibility to interference.

Status: Inactive
Publication Date: 2017-05-17
Owner: HUAZHONG UNIV OF SCI & TECH
Cites: 3; Cited by: 31

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to address the large number of fake followers on microblogs and other social networks. Existing detection and identification methods have poor accuracy, a large and complex computational load, and slow processing speed; they are sensitive to isolated points (outliers) and wrongly recorded data, and the data is easily disturbed during calculation. The invention provides a method for detecting fake followers on social networking sites, based on web crawlers and machine learning, that requires little computation, processes data quickly, is not easily disturbed during calculation, and detects fake followers with high accuracy.

Embodiment Construction

[0027] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0028] Referring to Figure 1 to Figure 4, the method of the present invention for detecting fake followers on social networking sites, based on web crawlers and machine learning, comprises the following steps:

[0029] a. First, use a crawler framework to build a web crawler that can automatically obtain user data from Weibo or other social networks, and define the corresponding item fields to store the desired structured data;

[0030] b. The web crawler then automatically obtains data from Weibo or other social networks and extracts the selected feature values. Starting from an initial URL, the crawler obtains the data to be extracted from the web page, then extracts new URLs and enters the next round of crawling, until the stop condition is met;

[0031] c. Select feature fields from the extracted data, obtain training samples and test...
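
The crawl loop in steps a and b can be illustrated with a minimal sketch. It is not the patent's implementation: the item fields (`followers`, `following`, `posts`), the `fetch` callback, the in-memory `FAKE_SITE`, and the `max_pages` stop condition are all assumptions made for illustration, since the source does not name the crawler framework, the fields, or the stop requirement.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class UserItem:
    """Hypothetical item fields; the source does not list the real ones."""
    url: str
    followers: int
    following: int
    posts: int


# fetch(url) returns the item parsed from that page plus the URLs found on it.
Fetch = Callable[[str], Tuple[UserItem, List[str]]]


def crawl(start_url: str, fetch: Fetch, max_pages: int) -> List[UserItem]:
    """Breadth-first crawl from start_url until max_pages items are stored."""
    seen = {start_url}
    queue = deque([start_url])
    items: List[UserItem] = []
    while queue and len(items) < max_pages:  # assumed stop condition
        url = queue.popleft()
        item, links = fetch(url)
        items.append(item)
        for link in links:
            if link not in seen:  # never queue the same URL twice
                seen.add(link)
                queue.append(link)
    return items


# Toy stand-in for the network so the sketch runs offline.
FAKE_SITE: Dict[str, Tuple[UserItem, List[str]]] = {
    "u/a": (UserItem("u/a", 120, 80, 300), ["u/b", "u/c"]),
    "u/b": (UserItem("u/b", 3, 2000, 0), ["u/a", "u/c"]),
    "u/c": (UserItem("u/c", 45, 60, 12), ["u/a"]),
}


def fake_fetch(url: str) -> Tuple[UserItem, List[str]]:
    return FAKE_SITE[url]


items = crawl("u/a", fake_fetch, max_pages=10)
```

In a real crawler, `fake_fetch` would be replaced by a function that downloads and parses the page; the queue and the `seen` set are what keep the crawl moving to new URLs without revisiting old ones.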

Abstract

The invention provides a method for detecting fake followers on social networking sites, based on a web crawler and machine learning. The method comprises the following steps: the web crawler automatically acquires data on users of a microblog or other social network, with a simulated login function; feature fields are selected from the data extracted by the crawler, and training samples and test samples are obtained; a classical SVM classifier is adopted, and multiple groups of data are randomly drawn from the training samples and fed into the classifier, which learns from them to form a trained classification model; the classification model is tested with the test samples, and its parameters are adjusted repeatedly to reach the best cross-validation accuracy; users of the microblog or other social network are then checked with the optimal classification model. As a result, fake follower detection accuracy is greatly improved, the amount of computation is low, processing is fast, the data is not easily disturbed during calculation, and the method is particularly suitable for processing massive data.
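
The train-then-tune loop described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's classifier: the source names only a "classical SVM", so a linear SVM trained with Pegasos-style sub-gradient descent is swapped in here to keep the example free of external libraries. The two synthetic features, the label convention (+1 fake, -1 genuine), the regularisation grid, and the fold count are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[List[float], int]  # (feature vector, label in {+1, -1})


def train_linear_svm(data: List[Sample], lam: float, epochs: int,
                     rng: random.Random) -> List[float]:
    """Pegasos-style sub-gradient descent on the regularised hinge loss."""
    w = [0.0] * (len(data[0][0]) + 1)  # last weight acts as the bias
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            xb = x + [1.0]
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(wj * xj for wj, xj in zip(w, xb))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y * xj for wj, xj in zip(w, xb)]
    return w


def predict(w: List[float], x: List[float]) -> int:
    score = sum(wj * xj for wj, xj in zip(w, x + [1.0]))
    return 1 if score >= 0 else -1


def cross_val_accuracy(data: List[Sample], lam: float, folds: int,
                       rng: random.Random) -> float:
    """Mean held-out accuracy over `folds` train/validation splits."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    correct = 0
    for k in range(folds):
        held_out = shuffled[k::folds]
        train = [s for i, s in enumerate(shuffled) if i % folds != k]
        w = train_linear_svm(train, lam, epochs=10, rng=rng)
        correct += sum(predict(w, x) == y for x, y in held_out)
    return correct / len(shuffled)


# Synthetic, well-separated data standing in for crawled feature fields.
rng = random.Random(0)
data: List[Sample] = (
    [([rng.gauss(2, 0.5), rng.gauss(2, 0.5)], 1) for _ in range(40)]
    + [([rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)], -1) for _ in range(40)]
)

# Adjust the parameter and keep the value with the best cross-validation score.
scores: Dict[float, float] = {
    lam: cross_val_accuracy(data, lam, folds=5, rng=rng)
    for lam in (0.001, 0.01, 0.1)
}
best_lam = max(scores, key=scores.get)
final_model = train_linear_svm(data, best_lam, epochs=10, rng=rng)
```

With a kernel SVM from an established library, the same loop would typically search over the penalty and kernel parameters instead of `lam`; the structure of "train on sampled groups, score on held-out data, keep the best setting" stays the same.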

Description

Technical field

[0001] The invention relates to a method for detecting fake followers, in particular to a method for detecting fake followers on social networking sites based on web crawlers and machine learning, and belongs to the technical field of data processing.

Background art

[0002] People all over the world now rely on online social networks (OSNs) to share knowledge, opinions and experiences, seek information and resources, and expand personal connections, but on social networking sites users' actions are not necessarily genuine. Because so many people use online social networks, these sites provide value to ordinary users but have also become platforms that are exploited for profit in various ways. For example: the large amount of user information on social networking sites is what advertisers and fraudsters hope to obtain; some people who want to increase the social engagement of their accounts use bots to like or repost; Popul...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06Q50/00, G06K9/62
CPC: G06F16/951, G06F16/955, G06Q50/01, G06F18/2411
Inventor: 王一博, 袁巍, 李佳桓, 李珩
Owner: HUAZHONG UNIV OF SCI & TECH