Optimization method of distributed vertical crawler service system
An optimization method for a distributed vertical crawler service system. It addresses problems such as the vertical crawler service system not working normally, low performance of the crawler logic unit, and complicated webpage download and analysis logic. The stated effects are reduced difficulty, improved download efficiency, and improved analysis efficiency.
Embodiment Construction
[0046] The workflow of an existing general distributed vertical crawler service system is shown in Figure 1. Crawler services run in a multi-threaded or multi-process manner and may be deployed on one host or on multiple hosts within the enterprise. Each crawler service obtains a download task from the task queue, sends an HTTP request to the target URL, and saves the returned result in memory. It then uses a DOM analyzer to parse the content of the HTML page, uses a DOM selector to select the useful information, and finally saves that information on the storage device.
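The per-task loop described in this paragraph can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `crawl_once`, `HeadingExtractor`, and `fake_fetch` are hypothetical, the standard-library `html.parser` stands in for the DOM analyzer and selector, a dictionary stands in for the storage device, and the HTTP request is replaced by an injected `fetch` function so the example runs without network access.

```python
import queue
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Stand-in for DOM analyzer plus selector: collects the text of <h1> tags."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.headings[-1] += data


def crawl_once(task_queue, fetch, storage):
    """Process one task: dequeue, download, parse, select, store."""
    url = task_queue.get()            # 1. obtain a download task
    html = fetch(url)                 # 2. request the page, hold result in memory
    parser = HeadingExtractor()       # 3. parse the HTML
    parser.feed(html)
    parser.close()
    # 4. select the useful information and 5. save it
    storage[url] = [h.strip() for h in parser.headings]
    task_queue.task_done()


def fake_fetch(url):
    """Hypothetical downloader returning a fixed page instead of doing HTTP."""
    return "<html><body><h1> Example page </h1><p>ignored</p></body></html>"
```

In this arrangement a single worker performs both the download and the analysis for each task, so a slow step of either kind holds up the whole worker.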
[0047] The present invention optimizes the structure and flow of a general distributed vertical crawler; the optimized flow is shown in Figure 2. In the present invention, the original crawler service system is divided into two parts, the download service and the page analysis logic, and both are deployed on multiple cloud hosts, and ...
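The split into a download service and page analysis logic can be sketched as two independent stages joined by a queue. This is only an illustration under stated assumptions: the source text available here does not say how the two parts communicate or how they are scheduled across cloud hosts, so an in-process `queue.Queue` and two threads are used as placeholders, and the names `download_service`, `analysis_logic`, and `run_pipeline` are hypothetical.

```python
import queue
import threading


def download_service(task_queue, page_queue, fetch):
    """Download stage: only fetches pages and passes the raw HTML on."""
    while True:
        url = task_queue.get()
        if url is None:               # sentinel: no more tasks
            page_queue.put(None)      # tell the analysis stage to stop as well
            break
        page_queue.put((url, fetch(url)))


def analysis_logic(page_queue, analyze, storage):
    """Analysis stage: only parses downloaded pages and stores the result."""
    while True:
        item = page_queue.get()
        if item is None:
            break
        url, html = item
        storage[url] = analyze(html)


def run_pipeline(urls, fetch, analyze):
    """Wire the two stages together (assumption: one thread per stage)."""
    task_queue, page_queue, storage = queue.Queue(), queue.Queue(), {}
    for url in urls:
        task_queue.put(url)
    task_queue.put(None)
    workers = [
        threading.Thread(target=download_service,
                         args=(task_queue, page_queue, fetch)),
        threading.Thread(target=analysis_logic,
                         args=(page_queue, analyze, storage)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return storage
```

For example, `run_pipeline(["u1", "u2"], lambda u: "<p>" + u + "</p>", len)` stores the length of each fetched page under its URL. Because neither stage calls the other directly, each could in principle be scaled or replaced separately, which is the kind of decoupling the paragraph describes.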