Method and apparatus for acquiring data

A technology for acquiring data, specifically web page data, applied in the computer field. It addresses the inability of existing methods to perform priority scheduling, enabling priority scheduling and improving the flexibility of data acquisition.

Active Publication Date: 2019-07-12
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0003] Current data acquisition methods usually acquire the web page data corresponding to each seed in sequence according to a preset scheduling order and cannot perform priority scheduling.



Embodiment Construction

[0033] The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. Note also that, for ease of description, the drawings show only the parts related to the invention.

[0034] The embodiments in the present application and the features in those embodiments can be combined with each other where no conflict arises. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.

[0035] Figure 1 shows an exemplary system architecture 100 to which the method for acquiring data or the apparatus for acquiring data of the present application can be applied.

[0036] As shown in Figure 1, the system architecture 100 may include a terminal device 101, a network 102 and serve...


Abstract

The embodiment of the invention discloses a method and device for acquiring data. One specific implementation of the method comprises: sequentially selecting target seed information from a seed information set; for each piece of target seed information selected, placing it into a priority target seed information queue or a conventional target seed information queue according to whether it carries a priority identifier indicating priority processing; and sequentially extracting links from the target seed information in the priority target seed information queue and in the conventional target seed information queue, and acquiring the web page data corresponding to the extracted links. This implementation improves the flexibility of data acquisition.
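The two-queue routing described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the `SeedInfo` record, its boolean `priority` field (standing in for the priority identifier), the function names, and the caller-supplied `fetch` function are all hypothetical. The sketch also assumes the priority queue is drained completely before the conventional queue, which the abstract implies but does not spell out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Iterable, List, Tuple


@dataclass
class SeedInfo:
    """Hypothetical seed record: the links to visit plus a priority flag."""
    links: List[str]
    priority: bool = False  # assumed stand-in for the "priority identifier"


def enqueue_seeds(seeds: Iterable[SeedInfo]) -> Tuple[Deque[SeedInfo], Deque[SeedInfo]]:
    """Select seeds in order and route each one to the priority or conventional queue."""
    priority_queue: Deque[SeedInfo] = deque()
    conventional_queue: Deque[SeedInfo] = deque()
    for seed in seeds:
        if seed.priority:
            priority_queue.append(seed)
        else:
            conventional_queue.append(seed)
    return priority_queue, conventional_queue


def acquire(seeds: Iterable[SeedInfo], fetch: Callable[[str], str]) -> Dict[str, str]:
    """Extract links queue by queue and fetch the page behind each link.

    Assumption: the priority queue is processed first, then the conventional one.
    `fetch` is supplied by the caller (for example an HTTP client wrapper).
    """
    priority_queue, conventional_queue = enqueue_seeds(seeds)
    pages: Dict[str, str] = {}
    for queue in (priority_queue, conventional_queue):
        while queue:
            seed = queue.popleft()
            for link in seed.links:
                pages[link] = fetch(link)
    return pages
```

Passing a stub for `fetch` that records the URLs it receives makes it possible to check the visiting order without any network access. Within each queue the original selection order is preserved, so only seeds carrying the flag jump ahead.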

Description

Technical field

[0001] The embodiments of the present application relate to the field of computer technology, specifically to Internet technology, and in particular to methods and devices for acquiring data.

Background

[0002] With the development of computer technology, data usually needs to be grabbed from web pages by web crawlers in order to support better data analysis. Web crawlers are also called scalable web crawlers, web spiders, and so on. A web crawler usually starts acquiring web page data from a set of URL (Uniform Resource Locator) links to be visited, which can be called seeds.

[0003] Current data acquisition methods usually acquire the web page data corresponding to each seed in sequence according to a preset scheduling order, so priority scheduling cannot be performed.

Summary of the invention

[0004] The embodiments of the present application propose methods and devices for acquiring data.

[0005] In a first aspect, an embodiment of ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/9535
CPC: G06F16/9535
Inventor: 陈坤斌, 方军, 郑志彬, 莫洋, 王万梁
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD