A method for significantly improving the speed of capturing and storing network information
A technique for capturing and storing network information, applied in the field of big data. It addresses problems such as time-consuming capture, cluttered content, and overly complex capture methods, and improves the speed of collection, capture, and storage.
Embodiment Construction
[0019] A method that dramatically increases the speed at which web information is captured and stored, including:
[0020] Step 1: use a web crawler to fetch the required information from the Internet, and extract the required keywords;
[0021] Step 2: through the URL queue, supply the crawler with the URLs of the sites whose data is to be crawled. These URLs are only a subset of all seed URLs. Put them into the to-be-crawled URL queue; take a URL out of that queue, resolve DNS to obtain the IP of the host, download the webpage corresponding to the URL, and store it in the downloaded webpage library; then put the URL of the downloaded webpage into the crawled URL queue, and analyze the URLs in the crawled queue;
[0022] Step 3: process the content captured by the crawler through the data classification module;
[0023] Step 4: store the URL information of the websites whose data is to be captured, the data extracted from the webpage by the crawler, and t...
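
The queue handling described in Step 2 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `crawl`, `fetch_page`, `resolve_host`, `LinkParser`, and `max_pages` are hypothetical, and feeding newly discovered links back into the to-be-crawled queue is an assumption about what "analyze the URLs in the crawled queue" involves. Keyword extraction (Step 1), classification (Step 3), and storage (Step 4) are not shown.

```python
import socket
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def resolve_host(url):
    """Resolve DNS and return the IP of the URL's host."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url}")
    return socket.gethostbyname(host)


def fetch_page(url):
    """Download the webpage corresponding to the URL."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(seed_urls, fetch=fetch_page, resolve=resolve_host, max_pages=10):
    to_crawl = deque(seed_urls)  # URL queue to be crawled
    crawled = []                 # crawled URL queue
    seen = set(seed_urls)        # avoids queuing the same URL twice
    page_library = {}            # downloaded webpage library

    while to_crawl and len(page_library) < max_pages:
        url = to_crawl.popleft()
        try:
            resolve(url)         # DNS lookup; the IP is not used further here
            html = fetch(url)
        except (OSError, ValueError):
            continue             # skip hosts or pages that cannot be reached
        page_library[url] = html
        crawled.append(url)

        # Assumed "analysis" step: extract links and queue the unseen ones.
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if link.startswith(("http://", "https://")) and link not in seen:
                seen.add(link)
                to_crawl.append(link)

    return page_library, crawled
```

Because `fetch` and `resolve` are passed in as parameters, the loop can be exercised against an in-memory set of pages without touching the network, and `max_pages` keeps a trial run bounded.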