
Index page main body link recording method and apparatus

A link-recording technology for index pages, applied in the Internet field, that addresses missed links, wasted crawl traffic, and the resulting loss of coverage in spider collection and recording.

Status: Inactive
Publication Date: 2016-01-20
BEIJING QIHOO TECH CO LTD +1
Cites: 2 · Cited by: 0

AI Technical Summary

Problems solved by technology

The disadvantage of this method is that when the scheduling interval is short, links are generally not missed (that is, no link goes uncollected), but traffic may be wasted; when the scheduling interval is long, links may be missed.
[0005] Missed links reduce the coverage of Spider collection and recording.
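
The trade-off can be made concrete with a little arithmetic. The sketch below is illustrative only and rests on assumptions that are not in the source: the spider fetches only the first screen of the index page, that screen always shows `page_size` links, and new links arrive at a constant rate. The function name and all parameters are hypothetical.

```python
from typing import Tuple


def fixed_interval_fetch(new_links_per_hour: int,
                         interval_hours: int,
                         page_size: int) -> Tuple[int, int]:
    """Return (missed, refetched) link counts for one fetch.

    Assumption (not from the source): each fetch sees exactly the
    newest `page_size` links and nothing older.
    """
    published = new_links_per_hour * interval_hours
    # Links pushed off the first screen before the next fetch are missed.
    missed = max(0, published - page_size)
    # Links still on the first screen from the previous fetch are fetched again.
    refetched = max(0, page_size - published)
    return missed, refetched


# Short interval: nothing is missed, but most of the page is fetched again.
print(fixed_interval_fetch(new_links_per_hour=5, interval_hours=1, page_size=20))
# Long interval: nothing is fetched twice, but links are missed.
print(fixed_interval_fetch(new_links_per_hour=5, interval_hours=8, page_size=20))
```

Under these assumptions no single fixed interval avoids both problems unless it happens to match the publication rate exactly.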



Examples


Embodiment Construction

[0034] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for a more thorough understanding of the present disclosure and to fully convey its scope to those skilled in the art.

[0035] As shown in Figure 1, an embodiment of the present invention provides an index page main body link recording method, which includes:

[0036] Step 110: obtain one or more main body links from the index page in reverse order of publication time. In the technical solution of this embodiment, an index page is a webpage whose main part (usually the centered area of the page) consists of links rather than pure...


Abstract

The present invention provides an index page main body link recording method and apparatus. The method comprises: acquiring one or more main body links from an index page in reverse order of publication time; determining whether the acquired main body links intersect with the recorded historical main body links; and, when there is no intersection, recording the acquired main body links and iteratively acquiring the next main body links until an intersection with the recorded historical main body links is found. Because main body links are acquired and recorded in this way, links are neither missed nor acquired repeatedly.
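
The loop described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: `fetch_batch`, `max_batches`, and the in-memory `history` set are hypothetical names, and recording the new links of the final, overlapping batch is an assumption the abstract does not spell out.

```python
from typing import Callable, List, Set


def record_new_links(fetch_batch: Callable[[int], List[str]],
                     history: Set[str],
                     max_batches: int = 100) -> List[str]:
    """Record main body links newest-first until a batch overlaps the history.

    fetch_batch(n) is a hypothetical callback returning the n-th batch of
    main body links, ordered from newest to oldest (empty when exhausted).
    """
    recorded: List[str] = []
    for batch_no in range(max_batches):
        batch = fetch_batch(batch_no)
        if not batch:
            break  # no more links on the index page
        # Check for an intersection before the history is modified.
        overlap = any(link in history for link in batch)
        for link in batch:
            if link not in history:
                history.add(link)
                recorded.append(link)
        if overlap:
            break  # reached links that were already recorded
    return recorded


# Hypothetical index page split into batches, newest first.
pages = [["e", "d"], ["c", "b"], ["a"]]
fetch = lambda n: pages[n] if n < len(pages) else []
history = {"a", "b"}
print(record_new_links(fetch, history))  # → ['e', 'd', 'c']
```

Stopping at the first overlap is what bounds the work: the spider reads only as deep as the new material goes, regardless of how long it has been since the previous visit.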

Description

Technical field

[0001] The present invention relates to the technical field of the Internet and, in particular, to a method and apparatus for recording the main body links of an index page.

Background technique

[0002] The spider (crawler) sits at the most upstream point of the search engine data flow. It is responsible for collecting resources from the Internet into local storage and providing them for subsequent retrieval, and it is one of the most important data sources of a search engine. The goal of the spider system is to discover and crawl all valuable web pages on the Internet. To achieve this goal, the first step is to find links to valuable web pages. Currently, spiders use multiple scheduling mechanisms to discover resource links as quickly and completely as possible:

[0003] (1) Scheduling the mined seed webpages at a certain period (scheduling means grabbing the links on the seed webpages, for example 20 times a day), so as to cover all time-sensitive webp...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 郑燕琴
Owner: BEIJING QIHOO TECH CO LTD