A web page information perception collection method
A collection method in the field of page information technology, applicable to special data processing applications, website content management, instruments, and the like. It addresses the problems that crawling is difficult to define, that post URLs cannot be identified automatically, and that mainstream websites refresh their pages quickly. Its effects are to reduce the customization workload and maintenance cost, to overcome the inability to collect information, and to avoid the risk of information loss.
Embodiment Construction
[0032] The specific implementation manners of the present invention will be further described in detail below in conjunction with the accompanying drawings.
[0033] As shown in Figure 1, the present invention provides a web page information perception collection method which, in practical application, comprises the following steps:
[0034] Step 001. Starting from the entry page of the website to be collected, load the site page by page and obtain all link URLs on each page; filter out non-post resources such as CSS, JS, images, audio, and video to obtain the full set of URLs of the website to be collected; then proceed to step 002;
[0035] Step 002. Determine whether the website to be collected has a URL rule and, at the same time, whether a full set of URL records exists for the website to be collected; according to the result, proceed to step 003 and step 005 respectively for parallel processing, or proceed to step 004 and step 005 respectively 006 for parallel proces...
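The link extraction and filtering described in step 001 might be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the names `LinkExtractor`, `is_static_resource`, `collect_candidate_urls`, and `STATIC_EXTENSIONS` are hypothetical, the extension list is an illustrative guess at "CSS, JS, pictures, audio or video", and the restriction to links on the same host is an added assumption. Page fetching and the page-by-page traversal are left out; the function takes already-loaded HTML for a single page.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
import posixpath

# Hypothetical extension list for non-post resources (CSS, JS, images, audio, video).
STATIC_EXTENSIONS = {
    ".css", ".js",
    ".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico",
    ".mp3", ".wav", ".mp4", ".avi",
}


class LinkExtractor(HTMLParser):
    """Collects the raw href/src attribute values found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def is_static_resource(url):
    """Return True if the URL path ends in a known static-resource extension."""
    path = urlparse(url).path.lower()
    return posixpath.splitext(path)[1] in STATIC_EXTENSIONS


def collect_candidate_urls(page_url, html):
    """Return the de-duplicated, absolute, same-host URLs on a page,
    with static resources filtered out (assumed reading of step 001)."""
    parser = LinkExtractor()
    parser.feed(html)
    site = urlparse(page_url).netloc
    seen = set()
    result = []
    for raw in parser.links:
        # Resolve relative links and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, raw))
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parsed.netloc != site:
            continue  # assumption: only links within the target website count
        if is_static_resource(absolute):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

Running `collect_candidate_urls` on every loaded page and merging the results would give the full URL set that step 002 takes as input. Filtering by file extension is only one possible heuristic; a real deployment might also inspect response content types.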
