Method and device for mining key pages of a website

A key-page mining technology, applied in the field of data mining processing, that addresses the poor accuracy and low recall rate of existing approaches and achieves high accuracy.

Active Publication Date: 2018-10-16
BEIJING BAIDU NETCOM SCI & TECH CO LTD
Cites: 5; Cited by: 0

Embodiment Construction

[0025] To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings and specific embodiments.

[0026] Refer to Figure 1, which is a schematic flow chart of the method for mining key pages of a website according to the present invention. As shown in Figure 1, the method includes:

[0027] Step S101: extracting navigation link strings from each webpage of the website respectively.

[0028] Step S102: Separate the extracted navigation link strings into link pairs.

[0029] Step S103: Determine the key link pairs among all the link pairs, and take the pages corresponding to the key link pairs as the key pages of the website.
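Steps S102 and S103 can be illustrated with a minimal Python sketch. The pairing of adjacent links follows the definition in the abstract. The selection criterion in `select_key_pairs` (keep a pair if it appears on at least `min_pages` pages) is an assumption made only for illustration, since the text available here does not state how key link pairs are determined. All function names and the sample URLs are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

LinkPair = Tuple[str, str]


def split_into_pairs(nav_links: List[str]) -> List[LinkPair]:
    """S102: pair each link with the link at the adjacent (next) position."""
    return list(zip(nav_links, nav_links[1:]))


def select_key_pairs(nav_strings: Dict[str, List[str]], min_pages: int) -> List[LinkPair]:
    """S103 under an ASSUMED criterion: keep pairs seen on at least min_pages pages."""
    counts: Counter = Counter()
    for links in nav_strings.values():
        # Count each pair at most once per page.
        counts.update(set(split_into_pairs(links)))
    return sorted(pair for pair, n in counts.items() if n >= min_pages)


def pages_of_pairs(pairs: List[LinkPair]) -> List[str]:
    """Collect the pages the selected pairs point to, without duplicates."""
    pages: List[str] = []
    for pair in pairs:
        for url in pair:
            if url not in pages:
                pages.append(url)
    return pages


# Hypothetical navigation link strings, keyed by the page they were found on.
nav_strings = {
    "/": ["/news", "/sports", "/tech"],
    "/news": ["/news", "/sports", "/tech"],
    "/about": ["/news", "/sports", "/contact"],
}
key_pairs = select_key_pairs(nav_strings, min_pages=2)
key_pages = pages_of_pairs(key_pairs)
```

Counting a pair once per page keeps a single page with a repeated navigation bar from dominating the tally; a different scoring rule could be substituted in `select_key_pairs` without changing the pairing step.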

[0030] The above steps are described in detail below.

[0031] Refer to Figure 2, which is a schematic diagram of a navigation link string in the present invention. As shown in ...

Abstract

The invention provides a method and a device for mining key pages of a website. The method includes: extracting a navigation link list from each page of the website; splitting each extracted navigation link list into link pairs, where each link pair consists of two links at adjacent positions in the navigation link list; and determining key link pairs among all the link pairs and taking the pages corresponding to the key link pairs as the key pages of the website. In this way, the recall rate and accuracy of mining the key pages of a website can be increased.
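As a rough illustration of the first step, the sketch below collects a navigation link list from one page using Python's standard `html.parser` module. It assumes, purely for the example, that navigation links are the `<a>` elements nested inside a `<nav>` element; the source does not say how navigation links are recognized, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class NavLinkExtractor(HTMLParser):
    """Collects href values of <a> tags that appear inside <nav> (an assumed convention)."""

    def __init__(self) -> None:
        super().__init__()
        self.nav_depth = 0            # how many <nav> elements are currently open
        self.links: List[str] = []    # hrefs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif tag == "a" and self.nav_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1


def extract_nav_links(html: str) -> List[str]:
    """Return the navigation link list of one page, preserving link order."""
    parser = NavLinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


sample = (
    '<nav><a href="/news">News</a><a href="/sports">Sports</a></nav>'
    '<p><a href="/other">Other</a></p>'
)
nav_links = extract_nav_links(sample)
```

Link order is preserved because the later pairing step depends on which links sit at adjacent positions.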

Description

【Technical field】

[0001] The invention relates to data mining processing technology, and in particular to a method and device for mining key pages of a website.

【Background technique】

[0002] Web page authority is an important factor that search engines use when ranking results. When the authority of web pages is calculated, all the pages participating in the calculation are treated as a set, and the authority of each page is computed iteratively from the link relationships between the pages in the set. However, as the Internet has developed, the number of web pages has grown steadily. Using every page on the Internet in the authority calculation would place very high demands on the architecture of the computing system, so usually only the pages of a website that have links with external websites are selected to participate in the authority calculation, but this method of the existing technology ...
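The iterative authority calculation mentioned in the background can be pictured with a PageRank-style power iteration. This is a generic stand-in, not the specific calculation used by any search engine or by this patent; the damping factor, round count, and sample graph are assumptions chosen for the example.

```python
from typing import Dict, List


def iterate_authority(
    links: Dict[str, List[str]], damping: float = 0.85, rounds: int = 50
) -> Dict[str, float]:
    """PageRank-style iteration over a set of pages and their outgoing links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    score = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(rounds):
        nxt = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links.get(src, [])
            if targets:
                # A page passes its score evenly to the pages it links to.
                share = damping * score[src] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for p in pages:
                    nxt[p] += damping * score[src] / n
        score = nxt
    return score


# Hypothetical three-page link graph.
scores = iterate_authority({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

Every round touches every page and every link in the set, which is why the size of the participating page set drives the cost of the calculation.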

Application Information

IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 张冲
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD