Focusing relevancy ordering method for vertical search engine

A focused relevance ranking method for vertical search engines, applied in the field of relevance ranking, that addresses problems such as query topic drift and the lack of a general relevance ranking scheme.

Status: Inactive
Publication Date: 2010-07-07
DONGHUA UNIV
2 Cites, 27 Cited by

AI Technical Summary

Problems solved by technology

[0007] There is currently no general, efficient relevance ranking scheme that solves the problem of user query topic drift without increasing the amount of stored information.


Examples


Embodiment Construction

[0042] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments are only used to illustrate the present invention and are not intended to limit its scope. In addition, it should be understood that, after reading the teachings of the present invention, those skilled in the art can make various changes or modifications to it, and these equivalent forms also fall within the scope defined by the appended claims of the present application.

[0043] Embodiments of the present invention relate to a focused relevance ranking method for a vertical search engine, comprising the following steps: (1) using a topic crawler to fetch web pages, store them in its URL queue, and collect topic data, thereby preparing data for the search engine; (2) analyzing the links of the fetched web pages, establishing a user behavior model by analyzing user click behavior, and deriving the PageRank value t...
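Step (2) combines link analysis with a model of user click behavior. The excerpt above is truncated before the patent's own formula, so the following is only a minimal sketch of one generic way to do this, not the patented algorithm: a PageRank iteration in which rank flows along each link in proportion to its observed click count. The function name `click_weighted_pagerank`, the `clicks` log format, and the add-one smoothing are all assumptions made for illustration.

```python
def click_weighted_pagerank(links, clicks, damping=0.85, iterations=50):
    """Generic click-weighted PageRank (illustrative, not the patent's formula).

    links:  dict mapping a page to the list of pages it links to.
    clicks: dict mapping (src, dst) to an observed click count
            (hypothetical log format assumed for this sketch).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    # Transition probability of each link is proportional to 1 + clicks,
    # so links that were never clicked still pass on a little rank.
    trans = {}
    for src in pages:
        weights = {dst: 1.0 + clicks.get((src, dst), 0)
                   for dst in links.get(src, [])}
        total = sum(weights.values())
        trans[src] = {dst: w / total for dst, w in weights.items()}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        # Rank held by pages without outlinks is spread over all pages.
        dangling = sum(rank[p] for p in pages if not trans[p])
        for src in pages:
            for dst, prob in trans[src].items():
                new_rank[dst] += damping * rank[src] * prob
        for p in pages:
            new_rank[p] += damping * dangling / n
        rank = new_rank
    return rank


links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
plain = click_weighted_pagerank(links, clicks={})
clicked = click_weighted_pagerank(links, clicks={("a", "b"): 9})
```

With an empty click log the sketch reduces to ordinary PageRank; with clicks concentrated on the link from `a` to `b`, page `b` receives a larger share of `a`'s rank than it would under uniform splitting.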


Abstract

The invention relates to a focused relevance ranking method for a vertical search engine. To address the problem that a focused crawler cannot pass through a "dark tunnel", the invention improves the crawler's focused crawling strategy by using an online learning method together with an auxiliary function, so that the crawler captures topic data of higher relevance. The PageRank algorithm and its improved variants are studied, the user's web page clicking behavior is modeled, and the way a PageRank value is transferred among links is improved, yielding an improved algorithm. To address the excessively high dimensionality of feature extraction models for web page weight, a custom web page weighting method is proposed: web page weight factors are defined, and the weight of each factor is measured according to a separability criterion, thereby providing an evaluation function for web page weight and effectively lowering the dimensionality of the web page feature space. With this method, users obtain a high-quality result set when using a topic-specific resource search engine system.
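The web page weight evaluation described in the abstract can be sketched as follows. The abstract does not state which factors or which criterion formula the patent uses, so this sketch assumes a Fisher-style ratio (squared gap between class means over the sum of class variances) as the separability measure and uses hypothetical factor names such as `title` and `body`. It illustrates the general idea of scoring a page with a few weighted factors instead of a high-dimensional feature vector.

```python
from statistics import mean, pvariance


def fisher_score(relevant, irrelevant, eps=1e-9):
    """Assumed separability measure: how well one factor splits the classes."""
    gap = mean(relevant) - mean(irrelevant)
    return gap * gap / (pvariance(relevant) + pvariance(irrelevant) + eps)


def factor_weights(relevant_pages, irrelevant_pages):
    """Turn per-factor separability scores into weights that sum to 1.

    Each argument is a list of dicts mapping a factor name to its value
    for one labeled sample page (hypothetical training data).
    """
    factors = list(relevant_pages[0])
    scores = {
        f: fisher_score([p[f] for p in relevant_pages],
                        [p[f] for p in irrelevant_pages])
        for f in factors
    }
    total = sum(scores.values())
    if total == 0:
        # No factor separates the classes: fall back to uniform weights.
        return {f: 1.0 / len(factors) for f in factors}
    return {f: s / total for f, s in scores.items()}


def page_weight(page_factors, weights):
    """Evaluation function: weighted sum of a page's factor values."""
    return sum(weights[f] * page_factors[f] for f in weights)


relevant = [{"title": 0.9, "body": 0.5}, {"title": 0.8, "body": 0.4}]
irrelevant = [{"title": 0.1, "body": 0.5}, {"title": 0.2, "body": 0.4}]
weights = factor_weights(relevant, irrelevant)
score = page_weight({"title": 1.0, "body": 0.0}, weights)
```

In this toy data the `body` factor has the same mean in both classes, so nearly all of the weight goes to `title`; a factor that does not help separate relevant from irrelevant pages contributes almost nothing to the final score.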

Description

Technical field

[0001] The invention relates to the technical field of computer network search engines, in particular to a focused relevance ranking method for vertical search engines, that is, a search method based on web page relevance techniques in search engine retrieval.

Background

[0002] With the increasing maturity of Internet-related technologies and the rapid growth of the information they contain, search engines have become the main means by which people retrieve Internet data. At present, the Internet already has 10 billion static web pages. Although traditional general-purpose search engines have comprehensive retrieval capabilities, they suffer from defects such as high data redundancy and low query accuracy, and can no longer meet users' accuracy requirements for information retrieval. Topic-oriented, specialized vertical search engines are gradually gaining market share and attracting widespread attention.

[0003] The purpose of a vertical search engine is to f...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 温泉, 傅增明, 程裕强
Owner: DONGHUA UNIV