
Website data capturing method, device and equipment and medium thereof

A data capture technology, applied in the field of communication, that addresses the problem of restricted information acquisition and achieves the effects of avoiding pipeline blockage, improving crawler efficiency, and high coupling.

Pending Publication Date: 2019-12-10
上海媒科锐奇网络科技有限公司

AI Technical Summary

Problems solved by technology

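Web crawlers are needed to collect information from shopping websites, for example for price comparison, but websites generally have anti-crawler mechanisms that limit the acquisition of such information.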



Examples


Embodiment Construction

[0052] In the following description, numerous technical details are set forth to give the reader a better understanding of the present application. However, those of ordinary skill in the art will understand that the technical solutions claimed in the present application can be realized even without these technical details, and with various changes and modifications based on the following embodiments.

[0053] In order to make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the present invention will be further described in detail below with reference to the accompanying drawings.

[0054] The first embodiment of the present invention relates to a website data capture method. Figure 1 is a schematic flow chart of the website data capture method.

[0055] Specifically, as shown in Figure 1, the website data capture method includes the following steps:

[0056] Step 101, judging ...


Abstract

The invention relates to the field of communication and discloses a website data capture method, device, equipment and medium. The website data capture method comprises the steps of: judging whether the use time of the current IP address used for capturing target website data exceeds a preset time threshold; if the preset time threshold is exceeded, judging whether the capture of the target website data meets a capture stop condition; if the capture stop condition is not met, stopping the capture of target website data with the current IP address, obtaining a currently unused IP address from the IP address list, and accessing the target website with the obtained IP address so as to capture target website data; and if the capture stop condition is met, stopping access to the target website. According to the invention, the IP address can be effectively prevented from being identified as a crawler IP address and prohibited from acquiring webpage information.
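
The steps in the abstract can be sketched as the loop below. This is a minimal illustration, not the patent's implementation: the names `capture_with_ip_rotation`, `fetch`, `should_stop` and `clock` are hypothetical. Two behaviours are assumptions, because the abstract does not state them: capture continues with the current IP address while the time threshold has not been exceeded, and the loop ends when the IP address list has no unused address left.

```python
import time
from collections import deque
from typing import Callable, Deque, List


def capture_with_ip_rotation(
    ip_list: List[str],
    fetch: Callable[[str], None],
    should_stop: Callable[[], bool],
    time_threshold: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Capture data while rotating IP addresses; return the IPs used, in order.

    fetch(ip) is a hypothetical callback that accesses the target website
    through the given IP address. should_stop() is a hypothetical callback
    that reports whether the capture stop condition is met.
    """
    unused: Deque[str] = deque(ip_list)
    if not unused:
        return []

    current_ip = unused.popleft()
    started_at = clock()
    used = [current_ip]

    while True:
        # Has the current IP address been in use longer than the threshold?
        if clock() - started_at > time_threshold:
            # Threshold exceeded: check the capture stop condition.
            if should_stop():
                break  # stop accessing the target website
            if not unused:
                break  # assumption: end when no unused IP address remains
            # Stop capturing with the current IP and take an unused one.
            current_ip = unused.popleft()
            started_at = clock()
            used.append(current_ip)
        # Assumption: otherwise keep capturing with the current IP address.
        fetch(current_ip)

    return used
```

Passing `clock` as a parameter is a design choice made only for this sketch, so that the rotation logic can be exercised with a fake clock instead of real waiting.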

Description

Technical Field

[0001] The present invention relates to the field of communications, and in particular to a website data capture method, device, equipment and medium.

Background

[0002] With the rapid development of online shopping, more and more shopping websites have appeared. To analyze the commodities on these shopping websites comprehensively, for example for price comparison, web crawlers must be used to obtain information from them. However, websites generally have anti-crawling mechanisms that limit the acquisition of such information.

Summary of the Invention

[0003] The purpose of the present invention is to provide a website data capture method, device, equipment and medium that can effectively prevent the IP address from being identified as a crawler IP address and prohibited from obtaining web page information.

[0004] In order to solve the above technical problems, the embodiments of the presen...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/951, G06F16/955
Inventor: 包喆元
Owner: 上海媒科锐奇网络科技有限公司