A Keyword-Based Oriented Web Page Acquisition Method
A keyword-based collection method, applied in electrical digital data processing, digital data information retrieval, instruments, etc. It addresses the low accuracy and efficiency of classifiers, the difficulty of accurately crawling topic-specific web pages, and frequent communication. Its effects are to improve the overall collection accuracy, the data collection rate, and global searchability.
Embodiment Construction
[0033] Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
[0034] The keyword-based directional web page collection method designed by the present invention has two aspects. (1) To address the low collection accuracy of topic-focused collection, a directional data collection method is proposed that uses historical collection data as a reference to dynamically set a suitable threshold and adjust the system's acquisition model in time, so that pages are captured both accurately and quickly. It also improves global searchability to a certain extent, keeps web page collection from falling into a local optimum, and raises the overall collection accuracy of the system through adaptive algorithms. (2) Based on a distributed platform, the distributed configuration environment is optimized, and the Nutch open source crawler framework is used to realize the distributed multi-threaded ...
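The idea in aspect (1) of setting a threshold dynamically from historical collection data can be illustrated with a minimal sketch. The excerpt does not give the scoring formula or the threshold rule, so everything below is an assumption made for illustration: the function `keyword_relevance`, the class `AdaptiveThreshold`, the keyword-coverage score, and the choice of a clamped moving average over recent scores are hypothetical and are not the method claimed in the patent.

```python
from collections import deque


def keyword_relevance(text, keywords):
    """Fraction of the keywords that occur in the page text (case-insensitive).

    Assumed scoring rule for illustration; the source does not specify one.
    """
    if not keywords:
        return 0.0
    lowered = text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in lowered)
    return hits / len(keywords)


class AdaptiveThreshold:
    """Acceptance threshold derived from recently observed relevance scores.

    Assumption: the threshold is the mean of the last `window` scores,
    clamped to [floor, ceiling]. Because rejected pages also enter the
    history, a run of low scores lowers the threshold, which is one simple
    way to keep the crawl from stalling in a region of poor pages.
    """

    def __init__(self, initial=0.5, window=50, floor=0.1, ceiling=0.9):
        self.initial = initial
        self.floor = floor
        self.ceiling = ceiling
        self.history = deque(maxlen=window)  # historical collection data

    def current(self):
        """Return the threshold implied by the current history."""
        if not self.history:
            return self.initial
        mean = sum(self.history) / len(self.history)
        return min(self.ceiling, max(self.floor, mean))

    def accept(self, score):
        """Decide whether to collect a page, then record its score."""
        decision = score >= self.current()
        self.history.append(score)
        return decision
```

In use, a crawler would score each fetched page with `keyword_relevance(page_text, topic_keywords)` and pass the result to `accept`; the window size and the clamp bounds are tuning parameters that would have to be chosen empirically.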