Scrapy-based data crawling method, terminal equipment and computer readable storage medium

A Scrapy-based data crawling technology in the data crawling field, applicable to terminal equipment and computer-readable storage media. It addresses the poor data crawling effect of existing approaches and aims to improve code-writing efficiency, reduce the number of vulnerabilities, and improve the crawling effect.

Pending Publication Date: 2019-08-20
ONE CONNECT SMART TECH CO LTD SHENZHEN

AI Technical Summary

Problems solved by technology

[0003] In view of this, the present invention proposes a Scrapy-based data crawling method, terminal equipment, and a computer-readable storage medium to solve the problem that data crawling performs relatively poorly when the Scrapy framework is used.

Examples

Embodiment Construction

[0038] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.

[0039] It should be noted that descriptions involving "first", "second", etc. in the present invention are for descriptive purposes only, and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, the features defined as "first" and "second" may explicitly or implicitly include at least one...

Abstract

The invention relates to the technical field of data crawling, and discloses a Scrapy-based data crawling method. The method comprises the following steps: defining the configuration parameters of a crawler file based on the Scrapy framework in a JavaScript Object Notation (JSON) file; naming the JSON file, creating a crawler file, and naming the crawler file according to the name of the JSON file; importing the configuration parameters of the JSON file into the crawler file; and running the crawler file into which the configuration parameters have been imported to crawl webpage data. The invention further provides terminal equipment and a computer-readable storage medium. With the method, the terminal equipment and the computer-readable storage medium, the configuration parameters of a Scrapy crawler file can be defined through a JSON file, and the JSON file gathers all the configuration needed by one crawler file, so that code-writing efficiency is improved, the number of vulnerabilities is reduced, and the webpage data crawling effect is improved.
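The steps in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the JSON field names (`allowed_domains`, `start_urls`, `settings`, `selectors`), the helper names `load_spider_config` and `make_spider_class`, and the `quotes` example are all assumptions. The attribute names `name`, `allowed_domains`, `start_urls` and `custom_settings` are the ones Scrapy reads from a spider class; the sketch uses only the standard library so it runs without Scrapy installed.

```python
import json
import tempfile
from pathlib import Path


def load_spider_config(config_dir: Path, spider_name: str) -> dict:
    """Read <spider_name>.json and map it to spider class attributes.

    The spider takes its name from the JSON file name, mirroring the
    naming step in the abstract. All JSON keys here are hypothetical.
    """
    config_path = config_dir / f"{spider_name}.json"
    with config_path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    return {
        "name": spider_name,
        "allowed_domains": config.get("allowed_domains", []),
        "start_urls": config.get("start_urls", []),
        "custom_settings": config.get("settings", {}),
        "selectors": config.get("selectors", {}),  # hypothetical field
    }


def make_spider_class(attrs: dict, base: type = object) -> type:
    """Create a class carrying the imported configuration.

    Assumption: with Scrapy installed, `base` would be scrapy.Spider and
    a parse callback would still have to be supplied; both are omitted.
    """
    return type(f"{attrs['name'].title()}Spider", (base,), dict(attrs))


# Demo: write a sample config file, then import it by name.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp)
    sample = {
        "allowed_domains": ["example.com"],
        "start_urls": ["https://example.com/quotes"],
        "settings": {"DOWNLOAD_DELAY": 1},
        "selectors": {"text": "span.text::text"},
    }
    (config_dir / "quotes.json").write_text(json.dumps(sample), encoding="utf-8")
    attrs = load_spider_config(config_dir, "quotes")

SpiderClass = make_spider_class(attrs)
print(SpiderClass.__name__, SpiderClass.name, SpiderClass.start_urls)
```

Keeping every per-site value in one JSON file means a new site needs a new configuration file rather than a new hand-written crawler, which is the efficiency gain the abstract describes.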

Description

Technical field

[0001] The present invention relates to the technical field of data crawling, in particular to a Scrapy-based data crawling method, terminal equipment and a computer-readable storage medium.

Background technology

[0002] With the rapid development of the information society, there is more and more data on the Internet. In order to obtain useful information, web crawler technology is often used to crawl useful data. In existing crawler technology, when a Scrapy-based crawler framework is used, code must be written repeatedly to crawl multiple websites; while writing the code, the developer must analyze the rules of the web pages in addition to the logic of the code, which affects the accuracy of the web page rules; in addition, the function switches and points of attention of the Scrapy crawler framework are scattered across files at various levels. In the process of using the Scrapy crawl...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/951
CPC: G06F16/951
Inventor: 董润华, 徐国强, 邱寒
Owner: ONE CONNECT SMART TECH CO LTD SHENZHEN