Method and device for identifying web crawler, storage medium and electronic equipment

A technology for identifying web crawlers, applied in the field of network information technology

Active Publication Date: 2019-09-17
BEIJING SANKUAI ONLINE TECH CO LTD
Cites: 12 | Cited by: 10

AI Technical Summary

Problems solved by technology

However, the existing behavior-based identification method relies on information collected by the web page's front-end JavaScript, and is difficult to apply to a mobile APP (application).



Embodiment Construction

[0066] Specific embodiments of the present disclosure will be described in detail below in conjunction with the accompanying drawings. It should be understood that the specific embodiments described here are only used to illustrate and explain the present disclosure, and are not intended to limit the present disclosure.

[0067] Figure 1 is a flow chart of a method for identifying web crawlers according to an exemplary embodiment. As shown in Figure 1, the method includes:

[0068] S10. Obtain access data;

[0069] S20. Determine feature data of the access data, where the feature data includes data characterizing distribution features of access interfaces and / or data characterizing distribution features of access time;

[0070] S30. Determine, according to the feature data, that the access data is user data or crawler data.
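Steps S10 to S30 can be illustrated with a minimal sketch. The source does not say which statistics serve as the "distribution features" or how the final decision is made, so everything specific here is an assumption for illustration only: the record layout `(interface, timestamp_seconds)`, the use of Shannon entropy over accessed interfaces, the standard deviation of gaps between requests, and the threshold rule in the hypothetical `classify` function.

```python
import math
from collections import Counter
from statistics import pstdev


def entropy(counts):
    """Shannon entropy (in bits) of a collection of occurrence counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def feature_data(records):
    """S20 (assumed form): derive feature data from one visitor's access data.

    records: list of (interface, timestamp_seconds) tuples (hypothetical layout).
    """
    interfaces = Counter(iface for iface, _ in records)
    times = sorted(t for _, t in records)
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        # Distribution feature of access interfaces (assumed: entropy).
        "interface_entropy": entropy(interfaces.values()),
        # Distribution feature of access time (assumed: spread of request gaps).
        "gap_stdev": pstdev(gaps) if gaps else 0.0,
    }


def classify(features, min_entropy=1.0, min_gap_stdev=1.0):
    """S30 (assumed rule): narrow interface use plus regular timing => crawler."""
    if (features["interface_entropy"] < min_entropy
            and features["gap_stdev"] < min_gap_stdev):
        return "crawler"
    return "user"


# S10: access data for two hypothetical visitors.
crawler_like = [("/api/item", t) for t in range(0, 50, 5)]
user_like = [("/home", 0), ("/search", 7), ("/item", 40), ("/cart", 45), ("/home", 130)]

print(classify(feature_data(crawler_like)))
print(classify(feature_data(user_like)))
```

A fixed threshold rule is used only to keep the sketch short; the same feature dictionary could instead be fed to a trained classifier.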

[0071] In step S10, a piece of access data can be obtained based on a visit by a visitor; the visitor can be a user or a crawl...

Abstract

The invention relates to a method and a device for identifying a web crawler, a storage medium and electronic equipment. The method comprises: acquiring access data; determining feature data of the access data, the feature data comprising data characterizing distribution features of access interfaces and/or data characterizing distribution features of access time; and determining, according to the feature data, whether the access data is user data or crawler data. The method and the device address two technical problems in the prior art: anti-crawling based on IP access frequency performs poorly, and crawler recognition based on user behavior captured from external interaction devices is difficult to apply to an APP on a mobile terminal.

Description

Technical field

[0001] Embodiments of the present disclosure relate to the field of network information technology, and in particular to a method, device, storage medium and electronic equipment for identifying web crawlers.

Background

[0002] A crawler is a program or script that automatically grabs information on the Internet according to certain rules. Crawlers can help staff quickly obtain a large amount of data on the network, but some malicious crawlers may violate user privacy, or increase the load on the server and affect its normal service, so it is necessary to take certain anti-crawler measures to prevent malicious use of crawlers.

[0003] In related technologies, the following two methods are used to identify crawlers:

[0004] One method is to identify crawlers based on the access frequency of each IP (Internet Protocol) address. For each IP, the access frequency is counted; when the access frequency is greater than the set thre...
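The IP access frequency approach described in the background can be sketched as follows. The source text is cut off before it says what happens once the threshold is exceeded, so this sketch only reports the IPs whose count exceeds it. The function name `flag_high_frequency_ips`, the fixed 60-second window, the threshold of 100 requests, and the `(ip, timestamp_seconds)` record layout are all assumptions for illustration.

```python
from collections import Counter


def flag_high_frequency_ips(requests, window_seconds=60, threshold=100):
    """Return the set of IPs exceeding `threshold` requests in any fixed window.

    requests: iterable of (ip, timestamp_seconds) tuples (hypothetical layout).
    """
    counts = Counter()
    for ip, ts in requests:
        # Bucket each request into a fixed-size time window per IP.
        counts[(ip, int(ts // window_seconds))] += 1
    return {ip for (ip, _), n in counts.items() if n > threshold}


# Hypothetical traffic: one IP sends 200 requests in 20 seconds, another sends 5.
traffic = [("192.0.2.1", i * 0.1) for i in range(200)]
traffic += [("192.0.2.2", i * 10) for i in range(5)]

print(flag_high_frequency_ips(traffic))
```

Because the check depends only on per-IP counts, a crawler that spreads its requests across many IPs stays under the threshold, which is consistent with the abstract's remark that the anti-crawling effect of this approach is poor.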

Application Information

IPC(8): G06F16/951
CPC: G06F16/951
Inventor: 肖圣龙, 武金, 刁士涵
Owner: BEIJING SANKUAI ONLINE TECH CO LTD