Crawler interception method based on user behavior portrait, electronic equipment, and storage medium

A user behavior portrait technology, applied in the field of network security, that addresses the inefficient interception of web crawlers caused by ignoring that normal users may share an IP address and that the UA can be set arbitrarily. Its effects are avoiding interception errors, reducing the interception error rate, and improving accuracy.

Active Publication Date: 2018-11-09
ZHANGYUE TECH CO LTD

AI Technical Summary

Problems solved by technology

However, this method does not account for the fact that normal users may share an IP address and that the UA can be set arbitrarily, which results in inefficient interception of web crawlers.


Examples

Embodiment 1

[0029] Figure 1 shows a flow chart of a crawler interception method based on user behavior portraits according to Embodiment 1 of the present invention. As shown in Figure 1, the method specifically includes the following steps:

[0030] Step S101, analyzing known crawler access requests to obtain user behavior portraits corresponding to known crawler access requests.

[0031] Based on confirmed, known crawler access requests, user behavior data such as the access traces left during access, operations on the page, and accesses to the server can be analyzed. For example, a large amount of user behavior data can be analyzed, and user behavior portraits can be obtained through training, induction, and other methods. The user behavior portrait includes data in multiple dimensions, such as the frequency of user access to the server, the length of time spent on the page, the speed of page access,...
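As a rough illustration of step S101, the sketch below derives a portrait from known crawler sessions. It is only a minimal sketch under stated assumptions: the `Session` record, the two feature names, and the use of a plain average as the "induction" step are hypothetical and are not specified by the patent text.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Session:
    """Hypothetical record: timestamps (seconds, ascending) of one client's page requests."""
    request_times: List[float]


def session_features(session: Session) -> Dict[str, float]:
    """Reduce one session to two of the dimensions named in the text."""
    times = session.request_times
    if len(times) < 2:
        raise ValueError("need at least two requests to measure behavior")
    duration = times[-1] - times[0]
    if duration <= 0:
        raise ValueError("timestamps must be ascending")
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        # Access frequency to the server, in requests per minute.
        "access_frequency": 60.0 * len(times) / duration,
        # Approximate time spent on each page, in seconds.
        "mean_dwell_seconds": mean(gaps),
    }


def build_portrait(known_crawler_sessions: List[Session]) -> Dict[str, float]:
    """Assumed induction step: average each dimension over known crawler sessions."""
    features = [session_features(s) for s in known_crawler_sessions]
    return {name: mean(f[name] for f in features) for name in features[0]}
```

A real system would likely use more dimensions and a trained model rather than a simple average; the averaging here only stands in for the "training, induction, and other methods" the text mentions.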

Embodiment 2

[0051] Figure 2 shows a flow chart of a crawler interception method based on user behavior portraits according to Embodiment 2 of the present invention. As shown in Figure 2, the method includes the following steps:

[0052] Step S201, analyzing known crawler access requests to obtain user behavior portraits corresponding to known crawler access requests.

[0053] For this step, refer to the description of step S101 in Embodiment 1, and details are not repeated here.

[0054] Step S202, receiving a page access request sent by the client.

[0055] Step S203, judging whether the originator of the access request is in the pre-established search engine white list.

[0056] Since some search engines also use crawler technology to access pages, the user behavior characteristics they generate closely match the user behavior portraits. However, these search engines are not objects that need to be intercepted, and...
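The whitelist check in step S203 could look something like the following sketch. The text does not say how the originator of a request is identified, so matching by IP network is an assumption here, and the network shown is a reserved documentation range rather than any real search engine's address block.

```python
import ipaddress

# Hypothetical pre-established whitelist; 192.0.2.0/24 is a documentation-only range.
SEARCH_ENGINE_WHITELIST = [ipaddress.ip_network("192.0.2.0/24")]


def is_whitelisted(client_ip: str, whitelist=SEARCH_ENGINE_WHITELIST) -> bool:
    """Return True when the request originator falls inside a whitelisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in whitelist)
```

Requests for which `is_whitelisted` returns True would skip the crawler comparison, since whitelisted search engines are not objects that need to be intercepted.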

Embodiment 3

[0069] Embodiment 3 of the present application provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the executable instruction can perform the crawler interception method based on user behavior portraits in any of the above method embodiments.

[0070] Specifically, the executable instruction can be used to make the processor perform the following operations:

[0071] Analyze known crawler access requests to obtain user behavior portraits corresponding to the known crawler access requests; receive a page access request sent by a client, and obtain user behavior characteristics from the user behavior data generated by the access request; compare the user behavior characteristics with the user behavior portrait of the crawler access requests to determine whether the access request is a crawler access request; if so, intercept the access request.
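The compare-and-intercept operations above can be sketched as follows. The comparison rule is not given in the text; the relative-deviation test, the `tolerance` parameter, and the feature names are all hypothetical choices made for illustration.

```python
from typing import Dict


def matches_portrait(
    features: Dict[str, float],
    portrait: Dict[str, float],
    tolerance: float = 0.25,
) -> bool:
    """Assumed rule: every portrait dimension must lie within a relative tolerance."""
    for name, expected in portrait.items():
        observed = features.get(name)
        if observed is None:
            return False  # a missing dimension cannot confirm a match
        scale = max(abs(expected), 1e-9)  # guard against division by zero
        if abs(observed - expected) / scale > tolerance:
            return False
    return True


def handle_request(features: Dict[str, float], portrait: Dict[str, float]) -> str:
    """Intercept requests whose behavior matches the crawler portrait; allow the rest."""
    return "intercept" if matches_portrait(features, portrait) else "allow"
```

For example, with a portrait of `{"access_frequency": 60.0, "mean_dwell_seconds": 1.0}`, a client making 66 requests per minute with 0.9 seconds per page would be intercepted, while one making 4 requests per minute with 30 seconds per page would be allowed.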

[0072] In an optional implementation manner, the user behavior ...

Abstract

The invention discloses a crawler interception method based on a user behavior portrait, electronic equipment, and a storage medium. The method comprises the steps of: analyzing known crawler access requests to obtain the user behavior portrait corresponding to the known crawler access requests; receiving a webpage access request sent by a client, and obtaining user behavior characteristics from the user behavior data generated by the access request; comparing the user behavior characteristics with the user behavior portrait of the crawler access requests to determine whether the access request is a crawler access request; and intercepting the access request if it is a crawler access request. Because the user behavior portrait is obtained by analyzing known crawler access requests, it accurately describes the feature points of crawler access requests. Comparing the user behavior characteristics of the client's access request with this portrait improves the accuracy of the comparison and avoids interception errors. Further, manual user verification is provided during interception to reduce the interception error rate.

Description

Technical field

[0001] The invention relates to the field of network security, in particular to a crawler interception method based on user behavior portraits, electronic equipment, and a storage medium.

Background technique

[0002] Web crawlers are a fundamental part of search engine technology. A web crawler grabs relevant information from a page by visiting it and stores that information on the search engine's server, which then provides search results to users. When legitimate search engines use web crawlers, they generally identify themselves to the server through the UA (User-Agent) field of the HTTP request. By examining the server's logs, the User-Agent field can be used to identify which crawlers have visited the server and how often they visit. However, some malicious web crawlers leave the User-Agent field empty, or disguise themselves as legitimate search engines. These malicious web crawlers steal information from pages, po...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06, G06F17/30
CPC: H04L63/0281, H04L63/1466
Inventor: 杨磊, 朱金辉, 冯威
Owner: ZHANGYUE TECH CO LTD