
Method and device for analyzing search characteristics of pages

An analysis method and analysis device applied in the field of search. They address the problems of low efficiency, insufficient ability to discover new links, and ranking that favors popular pages.

Active Publication Date: 2020-06-12
ALIBABA (CHINA) CO LTD
11 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, this algorithm also has some disadvantages. First, popular pages often rank higher than less popular pages, which is not conducive to mining pages that meet users' less popular needs. Second, old pages rank higher than new pages, because even a really good new page will not have many upstream links, so the algorithm is not good for new page discovery.
[0014] The advantage of the HITS algorithm is that it can better describe the organizational characteristics of the Internet. However, the HITS algorithm also has some shortcomings, such as low efficiency: HITS is a query-dependent algorithm, so it must be calculated in real time after a user query is received. In addition, it shares the problem of insufficient ability to mine less popular links and to discover new links.
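To illustrate why HITS is costly at query time, the following is a minimal sketch of the standard hub/authority iteration. It is not taken from the patent text; the function name `hits`, the `links` dictionary layout, and the fixed iteration count are assumptions made for illustration. In practice the link graph is built around the results of a specific query, which is why the scores cannot simply be precomputed offline.

```python
def hits(links, iterations=50):
    """Run a basic HITS iteration.

    links: dict mapping each page to the list of pages it links to
           (hypothetical input format, assumed for this sketch).
    Returns (hub, authority) score dicts, each L2-normalized.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages linking to this page.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}

        # Hub score: sum of authority scores of the pages this page links to.
        hub = {n: sum(auth[t] for t in links.get(n, [])) for n in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}

    return hub, auth


# Two pages link to "c"; "c" links to nothing.
hub_scores, auth_scores = hits({"a": ["c"], "b": ["c"], "c": []})
```

In this toy graph, "c" receives all of the authority score, while a page with no inbound links receives none. That mirrors the weakness described above: a new page with few upstream links scores poorly regardless of its quality.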



Examples


Embodiment Construction

[0057] Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

[0058] As mentioned above, existing page value analysis schemes either start only from the characteristics of the pages themselves and determine page value by analyzing the link relationships between pages, which is not conducive to mining new pages and pages that meet users' less popular needs; or they perform the calculation only after receiving a user query, so the calculation efficiency is low. Aiming at the shortcomings of exis...


Abstract

The invention discloses a method and device for analyzing the search characteristics of pages. The method comprises: calculating a first similarity between historical search requests in a search set and pages in a page set; treating a historical search request and a page whose first similarity exceeds a first preset threshold as matching each other; and analyzing each page according to its matching information to determine the search characteristics of the page. All steps of the method can therefore be performed offline, and the search characteristics of a page are determined from the matching information between the page and the historical search requests. Compared with existing page analysis schemes, the search characteristics determined in this way better reflect users' search intent and help uncover new pages as well as pages that meet users' less popular demands.
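The steps in the abstract can be sketched as a small offline job. This is only an illustration under stated assumptions: the abstract does not say how the first similarity is computed, so Jaccard overlap of whitespace tokens is used here as a stand-in, and the function names, the threshold value, and the choice of "matched query count" as a search characteristic are all hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into a set of lowercase whitespace-separated tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed stand-in for the 'first similarity')."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_queries_to_pages(queries, pages, threshold):
    """Pair each historical query with every page whose similarity exceeds the threshold.

    queries: list of historical search request strings (the search set).
    pages:   dict of page id -> page text (the page set).
    Returns a dict of page id -> list of matched queries.
    """
    page_tokens = {pid: tokenize(text) for pid, text in pages.items()}
    matches = defaultdict(list)
    for query in queries:
        query_tokens = tokenize(query)
        for pid, tokens in page_tokens.items():
            if jaccard(query_tokens, tokens) > threshold:
                matches[pid].append(query)
    return dict(matches)


def search_characteristics(matches, pages):
    """Derive a simple per-page characteristic from the matching information."""
    return {
        pid: {
            "matched_query_count": len(matches.get(pid, [])),
            "matched_queries": list(matches.get(pid, [])),
        }
        for pid in pages
    }


pages = {
    "p1": "cheap flights to tokyo",
    "p2": "python tutorial for beginners",
    "p3": "gardening tips",
}
queries = ["cheap flights tokyo", "python tutorial", "tokyo flights"]

matches = match_queries_to_pages(queries, pages, threshold=0.4)
characteristics = search_characteristics(matches, pages)
```

Nothing in this sketch depends on a live user query, so it can run in batch over a query log. A page with no inbound links can still receive a non-zero characteristic if historical queries match it, which is the property the abstract relies on for new and less popular pages.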

Description

Technical field

[0001] The invention relates to the field of search technology, and in particular to a method and device for analyzing search characteristics of pages.

Background

[0002] Existing commercial search engines basically adopt the overall architecture shown in Figure 1: crawlers regularly crawl web pages on the Internet, offline analysis completes feature calculation and index construction for the web pages, and an online retrieval system finally provides retrieval services to users. However, it is estimated that the Chinese Internet alone currently has about 100 trillion web pages, with about 10 billion new web pages added every day. Such a scale poses a huge challenge to crawling, storage, indexing, and retrieval.

[0003] At present, the main solution is to select a subset that is considered to have "value" from the complete set of web pages for priority processing. The current relatively well-known web page value analysi...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/953, G06F40/289, G06F40/30
CPC: G06F16/951, G06F40/289, G06F40/30
Inventor: 尹文科, 徐健, 刘高强, 闫彬
Owner: ALIBABA (CHINA) CO LTD