Webpage crawling method and apparatus

A web page and web crawler technology, applied in the Internet field, that can solve problems such as low efficiency and achieves the effects of reducing possibility and improving efficiency.

Active Publication Date: 2017-05-10
BEIJING GRIDSUM TECH CO LTD
7 Cites, 15 Cited by

AI Technical Summary

Problems solved by technology

[0005] The main purpose of this application is to provide a web crawling method and device that solve the problem, in the related art, of low efficiency when a web crawler on a single server crawls keyword search engine result pages.

Examples

Detailed Description of the Embodiments

[0021] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0022] To help those skilled in the art better understand the solution of the present application, the technical solution in the embodiments of the application is described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments that persons of ordinary skill in the art obtain from the embodiments in this application without creative effort shall fall within the scope of protection of this application.

[0023] It should be noted that the terms "first" and "second...

Abstract

The invention discloses a webpage crawling method and apparatus. The method comprises the steps that a plurality of servers obtain keyword sets from a task queue, wherein the task queue stores a plurality of to-be-crawled keyword sets, and each to-be-crawled keyword set contains a plurality of keywords; and the servers crawl search engine result pages corresponding to all the keywords in the obtained keyword sets through respective network crawlers. According to the method and the apparatus, the technical problem of relatively low efficiency of crawling keyword search engine result pages through a network crawler of a single server in related technologies is solved.
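
The flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the "servers" are simulated by threads in one process, the task queue is an in-memory `queue.Queue`, and `fake_fetch_serp` is a hypothetical stand-in for a real crawler request (it performs no network access). The names `run_servers` and `fake_fetch_serp` do not come from the source.

```python
import queue
import threading
from typing import Callable, Dict, List, Tuple


def fake_fetch_serp(keyword: str) -> str:
    """Hypothetical stand-in for a crawler request; does no network I/O."""
    return f"<html>results for {keyword}</html>"


def run_servers(
    keyword_sets: List[List[str]],
    num_servers: int,
    fetch: Callable[[str], str] = fake_fetch_serp,
) -> Dict[str, Tuple[int, str]]:
    """Simulate several servers draining a shared queue of keyword sets.

    Returns a mapping: keyword -> (id of the server that crawled it, page).
    """
    # The task queue stores whole keyword sets, not individual keywords.
    tasks: "queue.Queue[List[str]]" = queue.Queue()
    for keyword_set in keyword_sets:
        tasks.put(list(keyword_set))

    results: Dict[str, Tuple[int, str]] = {}
    lock = threading.Lock()

    def server_loop(server_id: int) -> None:
        while True:
            try:
                # Each server obtains one keyword set at a time.
                keyword_set = tasks.get_nowait()
            except queue.Empty:
                return  # no keyword sets left to crawl
            # The server crawls the result page for every keyword in its set.
            for keyword in keyword_set:
                page = fetch(keyword)
                with lock:
                    results[keyword] = (server_id, page)

    threads = [
        threading.Thread(target=server_loop, args=(i,))
        for i in range(num_servers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    sets = [["shoes", "boots"], ["laptops"], ["tea", "coffee", "cocoa"]]
    for keyword, (server_id, _page) in sorted(run_servers(sets, 3).items()):
        print(f"{keyword}: crawled by server {server_id}")
```

Because a server takes a whole keyword set from the queue, all keywords in that set are crawled by the same server, while different sets can be processed by different servers at the same time. In a real deployment the queue would presumably be a shared service reachable from every server rather than an in-process object.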

Description

Technical field

[0001] The present application relates to the field of the Internet, and in particular to a webpage crawling method and device.

Background

[0002] In the traditional search engine optimization (SEO) business, it is usually necessary to help users analyze how keywords rank in search engines. Typically, users preset a set of keywords and regularly use web crawlers to crawl the rankings of these keywords in search engines, that is, to crawl the search engine result pages corresponding to the keywords. A search engine result page is the page of search results displayed after keywords are entered into a search engine (e.g., Baidu, Sogou, etc.).

[0003] However, in order to prevent robots (for example, web crawlers) from accessing or reduce abnormal access traffic, search engines often limit the search speed or number of searches of a single IP address (i.e., an anti-crawler strategy), and usual...

Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventor 何熠皓
Owner BEIJING GRIDSUM TECH CO LTD