Differentiated webpage ranking method based on PageRank

A differentiated ranking method applied in the field of search engines. It addresses the unreliable ranking results produced by PageRank and improves ranking performance.

Status: Inactive
Publication Date: 2018-08-28
TIANJIN UNIV
Cites: 2; Cited by: 10

AI Technical Summary

Problems solved by technology

[0006] The present invention provides a differentiated webpage ranking method based on PageRank. It addresses the unreliable ranking results that arise because the PageRank algorithm distributes link weight evenly among a page's outgoing links. See the following description for details:



Examples


Embodiment 1

[0030] To achieve the above object, an embodiment of the present invention proposes a PageRank-based differentiated webpage ranking algorithm (The Improved PageRank Algorithm Based on Web Page Differentiation, DPR). Depending on the standard used to evaluate webpage authority, it has two variants: algorithm DPR-A, which evaluates the authority of a webpage by its number of incoming links, and algorithm DPR-B, which evaluates it by its total number of links. Referring to Figure 1, the DPR method includes the following steps:

[0031] 101: Use the PageRank algorithm to calculate the initial PR value of each node;

[0032] Here, the embodiment of the present invention treats each webpage as a node in the network.

[0033] 102: Assign each webpage a weight according to its authority, and derive a new ranking-value formula from those weights;

[0034] 103: Calculate the web page ra...
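
The weighting in step 102 can be illustrated with a minimal sketch. The patent's own weight formula is not shown in this excerpt, so the code below rests on an assumption: the rank that page q passes on is split among its targets in proportion to each target's authority score, where the score is the in-link count (variant "A", following DPR-A) or the in-link plus out-link count (variant "B", following DPR-B). The function name `link_weights` and the adjacency-list input are hypothetical.

```python
from collections import defaultdict


def link_weights(out_links, variant="A"):
    """Return {(q, p): share}, the fraction of q's rank passed to target p.

    ASSUMPTION (not taken from the patent text): shares are proportional to
    each target's authority score. Variant "A" scores a page by its in-link
    count; variant "B" scores it by in-links plus out-links.
    `out_links` maps each page to a list of distinct pages it links to.
    """
    in_deg = defaultdict(int)
    for targets in out_links.values():
        for p in targets:
            in_deg[p] += 1

    def authority(p):
        score = in_deg[p]
        if variant == "B":
            score += len(out_links.get(p, []))
        return score

    weights = {}
    for q, targets in out_links.items():
        # Every target has at least one in-link (from q), so total > 0.
        total = sum(authority(p) for p in targets)
        for p in targets:
            weights[(q, p)] = authority(p) / total
    return weights


# Hypothetical three-page graph used only for illustration.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
weights_a = link_weights(graph, variant="A")
weights_b = link_weights(graph, variant="B")
```

In this toy graph, page c has two in-links and page b has one, so under variant "A" page a passes two thirds of its distributable rank to c instead of the even half that plain PageRank would give.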

Embodiment 2

[0043] The feasibility of the scheme in Embodiment 1 is verified below with reference to Figure 1, Figure 2, and specific calculation formulas. See the following description for details:

[0044] 201: To calculate the initial PR value of each node, first initialize every webpage with the same PR value, then iterate until the PR value of each webpage is stable. In each round of iteration, the PR value of webpage q is divided evenly among the pages that q links to, as shown in formula (1):

[0045] $$PR(p) = \frac{1 - \alpha}{N} + \alpha \sum_{q \to p} \frac{PR(q)}{OutDeg(q)} \qquad (1)$$

[0046] Here PR(p) is the PR value of webpage p, OutDeg(q) is the out-degree of webpage q, PR(q) is the PR value of webpage q, N is the total number of webpages, and α is the damping factor, usually set to 0.85. The damping factor handles link rings in the webpage set (that is, repeated webpage nodes along a chain of links in the network, which is a technical term well kno...
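
The iteration described in step 201 can be sketched as follows, assuming formula (1) takes the standard PageRank form PR(p) = (1 - α)/N + α · Σ PR(q)/OutDeg(q), summed over the pages q that link to p. The function name `pagerank`, the tolerance, the iteration cap, and the handling of pages with no out-links are assumptions made for illustration and are not taken from the patent.

```python
def pagerank(out_links, alpha=0.85, tol=1e-10, max_iter=200):
    """Iterate the standard PageRank update until the values are stable.

    `out_links` maps each page to a list of distinct pages it links to.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    n = len(nodes)
    pr = {p: 1.0 / n for p in nodes}  # same initial PR value for every page

    for _ in range(max_iter):
        new_pr = {p: (1.0 - alpha) / n for p in nodes}
        for q in nodes:
            targets = out_links.get(q, [])
            if targets:
                # q's PR value is divided evenly among the pages it links to.
                share = alpha * pr[q] / len(targets)
                for p in targets:
                    new_pr[p] += share
            else:
                # ASSUMPTION: a page with no out-links spreads its rank
                # evenly over all pages so that the total stays at 1.
                share = alpha * pr[q] / n
                for p in nodes:
                    new_pr[p] += share
        change = sum(abs(new_pr[p] - pr[p]) for p in nodes)
        pr = new_pr
        if change < tol:  # stop once the PR values are stable
            break
    return pr


# Hypothetical three-page graph used only for illustration.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, page c collects rank from both a and b, so it ends with the highest value, followed by a and then b.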

Embodiment 3

[0067] The schemes in Embodiments 1 and 2 are tested and compared below with reference to Figure 3 and Figure 4 to measure the effectiveness of the method. The specific steps are as follows:

[0068] The degree of improvement achieved by the present invention is tested and analyzed using concepts and indicators commonly applied to Internet search engines.
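
The abstract of this patent names recall, precision, and F-Measure among the indicators used. As a minimal sketch of how such indicators are commonly computed for a set of returned pages, the code below uses the usual set-based definitions and assumes that F-Measure means the balanced harmonic mean of precision and recall; the function name and the sample page identifiers are hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-measure) for one query.

    ASSUMPTION: F-measure is the balanced form 2PR / (P + R).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical result list and ground truth, for illustration only.
p, r, f = precision_recall_f(["p1", "p2", "p3", "p4"], ["p1", "p3", "p5"])
```

Here two of the four returned pages are relevant, giving a precision of 0.5, and two of the three relevant pages are returned, giving a recall of 2/3.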


Abstract

The invention discloses a differentiated webpage ranking method based on PageRank. The method includes: using the PageRank algorithm to calculate the initial ranking value of each webpage; assigning each webpage a weight by evaluating its authority from its number of in-links, which yields one ranking formula; assigning each webpage a weight by evaluating its authority from its total number of links, which yields another ranking formula; iterating with these formulas until the ranking value of each webpage is stable; and performing an experimental comparison, using the number of spam webpages detected, recall, precision, and F-Measure, to measure the effectiveness of the differentiated ranking method. The method solves the problem that the PageRank algorithm produces unreliable ranking results because it distributes link weight evenly.

Description

Technical field

[0001] The invention relates to data mining and Internet search engines, and to search engine optimization technology. In particular, it relates to a webpage ranking method that remedies PageRank's weakness of distributing link weight evenly.

Background

[0002] At present, the main search engine webpage ranking algorithms include the following. One is the Hypertext Induced Topic Selection (HITS) algorithm. The HITS algorithm calculates a hub value and an authority value for each matching page returned for the search keywords. The hub value of a page is the sum of the authority values of all pages it links to, and the authority value of a page is the sum of the hub values of all pages that link to it. The corresponding webpages are accordingly divided into hub pages and authoritative pages. The basic idea of the HITS algo...
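
The hub and authority definitions in the background can be sketched as a short iteration. This is a generic illustration of HITS, not code from the patent; the function name `hits`, the fixed iteration count, and the per-round Euclidean normalization are assumptions.

```python
import math


def hits(out_links, iterations=50):
    """Return (hub, authority) score dictionaries for a link graph.

    `out_links` maps each page to a list of distinct pages it links to.
    """
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    hub = {p: 1.0 for p in nodes}
    auth = {p: 1.0 for p in nodes}

    for _ in range(iterations):
        # Authority value: sum of the hub values of all pages linking in.
        auth = {p: 0.0 for p in nodes}
        for q, targets in out_links.items():
            for p in targets:
                auth[p] += hub[q]
        # Hub value: sum of the authority values of all pages linked to.
        hub = {q: sum(auth[p] for p in out_links.get(q, [])) for q in nodes}
        # ASSUMPTION: rescale each score vector to unit length every round
        # so the values do not grow without bound.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical graph: h1 and h2 act as hubs, x and y as authorities.
hub_scores, auth_scores = hits({"h1": ["x", "y"], "h2": ["x"], "x": [], "y": []})
```

In this toy graph, x is linked from both hubs and so receives the highest authority value, while h1 links to both authorities and receives the highest hub value.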


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9535
Inventors: 刘春凤, 刘莹, 王建荣, 喻梅, 应翔, 滕玉宁
Owner: TIANJIN UNIV