Method for automatically identifying web crawlers
A technology for automatically identifying crawlers, applied in the field of web crawlers, to prevent crawlers from collecting information
Embodiment Construction
[0017] The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments. JavaScript embedded in the web page redirects to the same page one or more times, and a status code is returned at the same time. Because a crawler typically deduplicates URLs it has already visited, it does not request the page again and therefore cannot crawl it normally. The JavaScript code in onload specifies a cookie or badcookie, and that value is used to identify whether the request comes from a crawler.
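The deduplication effect described in [0017] can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `DedupFrontier`, `crawl`, and the `fetchPage` callback are hypothetical names, and the model is a simplified crawler rather than any particular crawler implementation. It shows only that a crawler which skips already-seen URLs stops after the first response when the page redirects to itself.

```javascript
// Hypothetical model of a crawler frontier that deduplicates URLs.
class DedupFrontier {
  constructor() {
    this.seen = new Set();
  }

  // Returns true only the first time a URL is offered.
  shouldFetch(url) {
    if (this.seen.has(url)) return false;
    this.seen.add(url);
    return true;
  }
}

// Follows redirects, but never fetches a URL it has already seen.
// fetchPage(url) is an assumed callback returning { body, redirectTo }.
function crawl(startUrl, fetchPage) {
  const frontier = new DedupFrontier();
  const bodies = [];
  let next = startUrl;
  while (next != null && frontier.shouldFetch(next)) {
    const page = fetchPage(next);
    bodies.push(page.body);
    next = page.redirectTo; // null or undefined when there is no redirect
  }
  return bodies;
}
```

With a server that always answers the first request with a page redirecting to the same URL, this crawler collects only that first page and never issues the second request, which a browser executing the redirect would issue.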
[0018] The server home page returns a page containing only JS code (that is, script code written in JavaScript). This code is placed in the onload function and runs after the page has fully loaded. The JS code uses an algorithm (taking the IP address, headers, and other information as parameters) to set a cookie field, and then uses window.location to jump to the home page (this same page). If the server detects that the cookie is legal, it returns another piece of JS, which uses another algorit...
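The first stage of the exchange in [0018] might look roughly like the sketch below. The source names no concrete algorithm, so every identifier here is hypothetical: `tokenFor` is a stand-in FNV-1a-style hash over the IP address and User-Agent, `COOKIE_NAME` is an invented cookie name, and `handleRequest` works on a plain object rather than a real HTTP framework. The source text is cut off after the cookie is found legal, so the second script is deliberately not modeled.

```javascript
// Assumed cookie name; the source does not name the field.
const COOKIE_NAME = "chk";

// Stand-in for "a certain algorithm": an FNV-1a-style 32-bit hash
// over the client IP and User-Agent. Not the patent's algorithm.
function tokenFor(ip, userAgent) {
  let h = 0x811c9dc5;
  for (const ch of `${ip}|${userAgent}`) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Builds a page that contains only script. After load, the script sets
// the cookie and jumps back to the same page via window.location.
function challengePage(ip) {
  return `<html><body><script>
${tokenFor.toString()}
window.onload = function () {
  document.cookie = "${COOKIE_NAME}=" +
    tokenFor(${JSON.stringify(ip)}, navigator.userAgent) + "; path=/";
  window.location = window.location.href;
};
</script></body></html>`;
}

// req is an assumed shape: { ip, userAgent, cookies }.
function handleRequest(req) {
  const expected = tokenFor(req.ip, req.userAgent);
  if (req.cookies[COOKIE_NAME] === expected) {
    // The source continues with a second script here; it is truncated,
    // so this sketch only reports that the cookie was accepted.
    return { verdict: "cookie-legal", body: "" };
  }
  return { verdict: "challenge", body: challengePage(req.ip) };
}
```

A client that does not execute JavaScript never sets the cookie, so under these assumptions every request it makes receives the challenge page again.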
