Web crawler detection method based on access log IP analysis
A web crawler detection technology, applied in the fields of network data retrieval, network data indexing, and other database retrieval, that addresses the problem of normal users being falsely intercepted.
Embodiment Construction
[0037] The present invention will be further described in detail below in conjunction with the drawings.
[0038] The process of the present invention is shown in Figure 1, specifically:
[0039] 1. Use feature detection to inspect the features of the access request packet and determine whether the request comes from a common crawler. If a crawler is recognized, the requesting IP is determined to belong to a web crawler.
[0040] First, obtain the UserAgent field of the access request and check whether it contains keywords characteristic of automated programs: python, ruby, PhantomJS, pycurl, httpunit, Wget, and Java. If any of these keywords is detected, the request is judged to come from a crawler.
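The UserAgent check described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the keyword list comes from paragraph [0040], but the case-insensitive substring matching, the handling of an empty UserAgent, the function names `is_known_crawler` and `crawler_ips`, and the `(ip, user_agent)` record shape are all assumptions made for the example.

```python
# Keywords listed in paragraph [0040] of the source.
AUTOMATION_KEYWORDS = ("python", "ruby", "PhantomJS", "pycurl", "httpunit", "Wget", "Java")


def is_known_crawler(user_agent):
    """Return True if the UserAgent contains any automation keyword.

    Assumption: matching is a case-insensitive substring test; the source
    does not specify how the keywords are compared.
    """
    if not user_agent:
        # Assumption: a missing or empty UserAgent is not flagged by this step.
        return False
    ua = user_agent.lower()
    return any(keyword.lower() in ua for keyword in AUTOMATION_KEYWORDS)


def crawler_ips(records):
    """Collect the IPs whose requests carried a crawler-like UserAgent.

    `records` is a hypothetical iterable of (ip, user_agent) pairs, e.g.
    already parsed out of an access log.
    """
    return {ip for ip, user_agent in records if is_known_crawler(user_agent)}
```

A short usage example with made-up records:

```python
records = [
    ("203.0.113.5", "python-requests/2.31.0"),
    ("198.51.100.7", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"),
]
flagged = crawler_ips(records)  # contains only "203.0.113.5"
```

Plain substring matching is deliberately simple; a keyword as short as "Java" could in principle match unrelated UserAgent strings, so a real deployment would likely need a stricter pattern.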
[0041] Note: The feature keywords above were collected from the UserAgent strings of common automated programs. Tools that can issue HTTP requests are generally well known to practitioners in the field, so collecting their features is not difficult. If you encounter a new too...