Universal low-code crawler method and system for news blog website
A general low-code crawler method and system for news and blog websites. It addresses the high development and learning costs, difficult debugging, and high memory consumption of existing crawlers, and aims to improve development and maintenance efficiency, versatility, and crawling efficiency.
Examples
Embodiment 1
[0063] This embodiment proposes a general low-code crawler system for news and blog websites, developed on Node.js. As shown in Figure 2, the architecture comprises a configuration loading module, a page resource loading module, a data extraction module, a data storage module, an asynchronous multi-task management module, and a log and progress management module, where:
[0064] As shown in Figure 3, the page resource loading module includes an intelligent URL splicing module, a puppeteer-based dynamic page loading module, and an axios-based static page loading module;
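The following is a minimal TypeScript sketch of how URL splicing and the choice between the two loaders might look. It is not taken from the patent: the `PageConfig` shape, the `dynamic` flag, and the function names `spliceUrl` and `chooseLoader` are assumptions made for illustration. The actual page fetch (for example puppeteer's `page.goto` or `axios.get`) is left out so the sketch has no dependencies.

```typescript
// Hypothetical configuration shape; the field names are assumptions,
// not identifiers from the patent.
interface PageConfig {
  baseUrl: string;  // URL of the page the links were found on
  dynamic: boolean; // true: page needs JavaScript rendering
}

type LoaderKind = "puppeteer" | "axios";

// Resolve a relative, root-relative, or protocol-relative link against
// the page URL. Returns null for links that cannot be crawled over HTTP(S).
function spliceUrl(href: string, baseUrl: string): string | null {
  try {
    const url = new URL(href.trim(), baseUrl);
    if (url.protocol !== "http:" && url.protocol !== "https:") {
      return null; // e.g. "javascript:" or "mailto:" links
    }
    url.hash = ""; // the fragment does not change the fetched resource
    return url.toString();
  } catch {
    return null; // malformed link or malformed base URL
  }
}

// Pick the loader: a headless browser for dynamic pages,
// a plain HTTP client for static ones.
function chooseLoader(config: PageConfig): LoaderKind {
  return config.dynamic ? "puppeteer" : "axios";
}
```

Driving the loader choice from a configuration flag keeps page-specific decisions out of the code, in line with the low-code goal. A real system could also detect dynamic pages automatically, which this sketch does not attempt.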
[0065] As shown in Figure 4, the data extraction module includes a selector type identification module, a data object generation module, a CSS-based data extraction module, and an XPath-based data extraction module;
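Selector type identification and data object generation could be sketched as below. The heuristic (an XPath expression starts with `/`, `./`, `../`, or `(`, and anything else is treated as CSS) is an assumption for illustration, not the patent's stated rule, and the names `identifySelectorType`, `generateDataObject`, and `Extractor` are hypothetical. The CSS and XPath extractors are passed in as functions so the sketch stays independent of any particular parsing library.

```typescript
type SelectorType = "xpath" | "css";

// Assumed heuristic: XPath location paths begin with "/", "./", "../",
// or a parenthesized expression; CSS selectors never start that way.
function identifySelectorType(selector: string): SelectorType {
  const s = selector.trim();
  return /^(\/|\.\.?\/|\()/.test(s) ? "xpath" : "css";
}

// An extractor takes a selector and returns the matched text, or null.
type Extractor = (selector: string) => string | null;

// Build one data object from a field-name -> selector map, routing each
// selector to the extractor that matches its identified type.
function generateDataObject(
  fields: Record<string, string>,
  extractors: Record<SelectorType, Extractor>,
): Record<string, string | null> {
  const result: Record<string, string | null> = {};
  for (const [name, selector] of Object.entries(fields)) {
    const kind = identifySelectorType(selector);
    result[name] = extractors[kind](selector);
  }
  return result;
}
```

With this arrangement a user only writes selectors in the configuration and does not declare their type, since each selector is routed automatically.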
[0066] As shown in Figure 5, the data storage module includes a data object verification module, a JSON file storage module, a CSV file storage module,...
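As an illustration of the verification and file-format steps named above, the sketch below checks a data object for required fields and serializes records to JSON and CSV text. The function names, the `requiredFields` parameter, and the rule that an empty string counts as missing are assumptions, not details from the patent. CSV quoting follows the usual convention of doubling embedded quotes. Writing the resulting strings to disk (for example with Node's `fs.writeFile`) is omitted.

```typescript
type DataObject = Record<string, string>;

// Return the names of required fields that are absent or blank.
// An empty result means the object passes verification.
function verifyDataObject(obj: DataObject, requiredFields: string[]): string[] {
  return requiredFields.filter((field) => {
    const value = obj[field];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Quote a cell only when it contains a comma, quote, or line break.
function escapeCsvCell(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize rows to CSV with a header line; missing cells become empty.
function toCsv(rows: DataObject[], columns: string[]): string {
  const header = columns.map(escapeCsvCell).join(",");
  const lines = rows.map((row) =>
    columns.map((column) => escapeCsvCell(row[column] ?? "")).join(","),
  );
  return [header, ...lines].join("\r\n");
}

// Serialize rows to pretty-printed JSON.
function toJson(rows: DataObject[]): string {
  return JSON.stringify(rows, null, 2);
}
```

Running verification before storage means the same checked objects can go to either file format without each writer repeating the checks.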


