
Webpage data capturing method and device, storage medium and electronic device

Web page data and web page technology, applied in the computer field, addressing the problem of low crawling efficiency

Pending Publication Date: 2019-12-31
TENCENT TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0004] Embodiments of the present invention provide a method, device, storage medium, and electronic device for capturing webpage data, so as to at least solve the technical problem in the related art of low capture efficiency when capturing webpage data.



Detailed Description of the Embodiments

[0018] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, shall fall within the protection scope of the present invention.

[0019] It should be noted that the terms "first" and "second" in the description and claims of the present invention and in the above drawings are used to distinguish similar objects, and not necessarily to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate ...


Abstract

The invention discloses a webpage data capturing method and device, a storage medium, and an electronic device. The method comprises: obtaining, from the web page links on a target website, candidate web page links that match web page link information in a pre-configured web page link information set, wherein the web page links on the target website comprise the link corresponding to the homepage of the target website and the links corresponding to all levels of web pages under the homepage; searching, among the candidate webpages pointed to by the candidate web page links, for a target webpage whose webpage type is a target type, the target type indicating that a crawling rule for the target webpage is configured in the pre-configured web page link information set; and crawling webpage data from the target webpage according to the crawling rule. The method and the device solve the technical problem in the related art of relatively low crawling efficiency when crawling webpage data.
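The three steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `LinkInfo` shape, the use of regular expressions to match links, the `title_rule` crawling rule, the in-memory `PAGES` site, and the `example.com` URLs are all hypothetical, and a page is treated as being of the "target type" when its matching entry carries a crawling rule.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class LinkInfo:
    """One entry of the pre-configured link information set (hypothetical shape)."""
    pattern: str                                   # regex a candidate link must match
    rule: Optional[Callable[[str], dict]] = None   # crawling rule, if one is configured


def title_rule(html: str) -> dict:
    """Hypothetical crawling rule: extract the page title."""
    m = re.search(r"<title>(.*?)</title>", html, re.S)
    return {"title": m.group(1).strip() if m else None}


def select_candidates(links: List[str],
                      link_infos: List[LinkInfo]) -> List[Tuple[str, LinkInfo]]:
    """Step 1: keep only links that match an entry in the configured set."""
    candidates = []
    for url in links:
        for info in link_infos:
            if re.fullmatch(info.pattern, url):
                candidates.append((url, info))
                break
    return candidates


def crawl(links: List[str],
          fetch: Callable[[str], str],
          link_infos: List[LinkInfo]) -> Dict[str, dict]:
    """Steps 2 and 3: find target-type pages and apply their crawling rule."""
    results = {}
    for url, info in select_candidates(links, link_infos):
        if info.rule is None:        # no rule configured: not a target-type page
            continue
        results[url] = info.rule(fetch(url))
    return results


# Stand-in for a target website: homepage plus lower-level pages (assumed data).
PAGES = {
    "https://example.com/": "<title>Home</title>",
    "https://example.com/news/": "<title>News index</title>",
    "https://example.com/news/1": "<title>First story</title>",
    "https://example.com/about": "<title>About</title>",
}

LINK_INFOS = [
    LinkInfo(r"https://example\.com/news/\d+", title_rule),  # target type
    LinkInfo(r"https://example\.com/news/"),                 # candidate, no rule
]

data = crawl(list(PAGES), PAGES.__getitem__, LINK_INFOS)
print(data)
```

In this sketch only links that match the configured set are ever fetched, which is one plausible reading of how the method avoids blindly capturing useless content. A real crawler would replace `PAGES.__getitem__` with an HTTP fetch and would also need to discover the site's links first.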

Description

Technical field

[0001] The present invention relates to the field of computers, and in particular to a web page data capture method and device, a storage medium, and an electronic device.

Background

[0002] To understand developments in an industry more quickly and comprehensively, it is sometimes necessary to crawl the content of certain websites. The current crawling approach starts from one link, expands to new links, crawls them, expands again, and repeats. This approach has many problems: the crawl scope is relatively blind, a lot of useless content may be captured, site resources may not be captured completely because many links are never reached, and it is difficult to judge whether the crawl is complete.

[0003] No effective solution to the above problems has yet been proposed.

Contents of the invention

[0004] Embodiments of the present invention provide a method, device, stora...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/951
Inventor: 汤见乐
Owner: TENCENT TECH (BEIJING) CO LTD