Method of collecting Internet data
An Internet data technology, applied in network data retrieval, network data indexing, electronic digital data processing, etc. It solves the problems of data duplication and poorly matched captured data, with the effects of meeting user needs, avoiding repeated capture, and having a wide range of application.
Embodiment Construction
[0021] The present invention will be further described below in conjunction with specific examples.
[0022] The method for collecting Internet data according to the present invention first collects data according to rules that users configure in advance, including web page download rules, web page analysis rules, and content extraction rules.
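The three kinds of user-configured rules could be represented roughly as follows. This is a minimal sketch, not the patent's implementation: the class name `CollectionRules`, its fields, and the choice of regular expressions to express each rule are all assumptions made for illustration, since the text does not say what a rule contains.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CollectionRules:
    """Hypothetical container for the three rule types named in the text."""

    # Download rule (assumed form): regex that a URL must fully match to be fetched.
    url_pattern: str
    # Analysis rule (assumed form): regex with one group that captures links in a page.
    link_pattern: str
    # Extraction rules (assumed form): field name -> regex with one capture group.
    field_patterns: Dict[str, str] = field(default_factory=dict)

    def should_download(self, url: str) -> bool:
        """Apply the download rule to a candidate URL."""
        return re.fullmatch(self.url_pattern, url) is not None

    def find_links(self, html: str) -> List[str]:
        """Apply the analysis rule to find links inside a downloaded page."""
        return re.findall(self.link_pattern, html)

    def extract(self, html: str) -> Dict[str, List[str]]:
        """Apply every extraction rule and collect the captured values."""
        return {
            name: re.findall(pattern, html)
            for name, pattern in self.field_patterns.items()
        }
```

A user would build one `CollectionRules` object per target site before collection starts, for example with `url_pattern=r"https://example\.com/.*"` and a `title` extraction pattern. Regular expressions are used here only to keep the sketch dependency-free; a real configuration might use CSS selectors or XPath instead.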
[0023] In the present invention, the process of collecting and processing big data from Internet web pages mainly involves four aspects:
[0024] 1) Web crawler. Crawls page content from the network and extracts the required data from it.
[0025] 2) Data processing. Processes the content extracted by the web crawler.
[0026] 3) URL crawl queue. Provides the web crawler with the URLs of the websites from which data needs to be extracted.
[0027] 4) Data. The data includes three aspects: ① the URL information of the websites whose data needs to be captured, ② the data extracted from the web pages by the web crawler, and ③ the data processed by t...
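The four aspects listed above can be sketched as a small pipeline. This is an illustrative sketch under stated assumptions, not the patented method: `UrlQueue`, `run_pipeline`, and the `fetch`, `extract`, and `process` callables are hypothetical names, the duplicate check is a plain in-memory set, and the third data store is assumed to hold the output of the data-processing step because the source text is truncated at that point.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, Iterable, List, Set


class UrlQueue:
    """FIFO crawl queue that refuses any URL it has already accepted."""

    def __init__(self) -> None:
        self._pending: Deque[str] = deque()
        self._seen: Set[str] = set()

    def push(self, url: str) -> bool:
        """Enqueue a URL; return False if it was seen before (avoids repeated capture)."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._pending.append(url)
        return True

    def pop(self) -> str:
        """Return the oldest pending URL."""
        return self._pending.popleft()

    def __len__(self) -> int:
        return len(self._pending)


def run_pipeline(
    seeds: Iterable[str],
    fetch: Callable[[str], str],      # crawler: URL -> page content
    extract: Callable[[str], Any],    # crawler: page content -> extracted data
    process: Callable[[Any], Any],    # data processing: extracted -> processed
) -> Dict[str, List[Any]]:
    """Drain the URL queue, storing URLs, extracted data, and processed data."""
    queue = UrlQueue()
    for url in seeds:
        queue.push(url)

    # The three data categories; "processed" is an assumed reading of the truncated text.
    store: Dict[str, List[Any]] = {"urls": [], "raw": [], "processed": []}
    while len(queue):
        url = queue.pop()
        raw = extract(fetch(url))
        store["urls"].append(url)
        store["raw"].append(raw)
        store["processed"].append(process(raw))
    return store
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library and lets it be exercised with an in-memory dictionary of pages. A seed list that contains the same URL twice yields a single stored entry, which is the repeated-capture behavior the method aims to avoid.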