Relevance score distribution-based method for classifying query intentions

A query intent classification technique based on relevance scores, applied in digital data processing and special-purpose data processing applications. It addresses problems such as insufficient query click logs and matched anchor text sets that are missing or contain too few elements.

Status: Inactive; Publication Date: 2012-04-11
PEKING UNIV
3 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

[0031] The technical problem to be solved by the present invention is that query click logs are insufficient for queries in the long tail of the distribution, and the problem...


Examples

Embodiment Construction

[0054] The method for classifying query intentions based on the distribution of relevance scores proposed by the present invention is described in detail below with reference to the drawings and embodiments.

[0055] As shown in Figure 1, the method of the embodiment of the present invention includes the following steps:

[0056] S1. Acquire the search results and web pages for the query;

[0057] S2. Construct a search result set from the search results and the web pages;

[0058] S3. Measure the relevance scores of the documents in the search result set;

[0059] S4. Classify the query intent using the distribution of the relevance scores.
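
As an illustration of step S4, the following minimal Python sketch summarizes the distribution of relevance scores over a result set and applies a simple threshold rule. It is not the patent's classifier: the features (`top_share`, normalized entropy), the threshold value, and the labels `"concentrated"` and `"dispersed"` are all assumptions introduced here, and the mapping from such labels to actual intent categories is not stated in this excerpt.

```python
import math
from typing import Dict, Sequence


def score_distribution_features(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize how relevance scores are spread over a result set.

    Assumes non-negative scores (an assumption of this sketch).
    """
    if any(s < 0 for s in scores):
        raise ValueError("relevance scores must be non-negative")
    total = sum(scores)
    if total <= 0:
        # Empty result set or all-zero scores: no distribution to describe.
        return {"top_share": 0.0, "entropy": 0.0}
    probs = [s / total for s in scores]
    # Shannon entropy, normalized to [0, 1] by its maximum, log(len(scores)).
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(scores)) if len(scores) > 1 else 1.0
    return {"top_share": max(probs), "entropy": entropy / max_entropy}


def classify_by_distribution(scores: Sequence[float],
                             top_share_threshold: float = 0.5) -> str:
    """Hypothetical rule: label the query by how concentrated the scores are."""
    features = score_distribution_features(scores)
    if features["top_share"] >= top_share_threshold:
        return "concentrated"  # most of the score mass sits on one document
    return "dispersed"         # score mass is spread over many documents
```

For example, `classify_by_distribution([9.0, 0.5, 0.5])` falls on the concentrated side of the rule, while four equal scores fall on the dispersed side. In practice the features would more likely feed a trained classifier than a fixed threshold.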

[0060] Steps S2 to S4 are described in detail below.

[0061] S2. Construct the retrieval result set

[0062] To build the result set (a document collection), the search results (which may be web pages) are first crawled to obtain the first n results returned by the search engine. If the...
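
The first part of step S2, keeping the first n results returned by the search engine, can be sketched as follows. This is only an illustration under stated assumptions: the `SearchResult` type, its fields, and the removal of duplicate URLs are hypothetical choices made here, and the rest of the construction procedure is truncated in this excerpt, so it is not reproduced.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SearchResult:
    """Hypothetical record for one crawled search result."""
    rank: int        # 1-based position in the search engine's result list
    url: str
    page_text: str   # content of the crawled page, assumed already fetched


def build_result_set(results: Iterable[SearchResult], n: int) -> List[SearchResult]:
    """Keep the first n results by rank, skipping duplicate URLs."""
    if n < 0:
        raise ValueError("n must be non-negative")
    seen_urls = set()
    result_set: List[SearchResult] = []
    for result in sorted(results, key=lambda r: r.rank):
        if len(result_set) >= n:
            break
        if result.url in seen_urls:
            continue  # de-duplication is an assumption of this sketch
        seen_urls.add(result.url)
        result_set.append(result)
    return result_set
```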


Abstract

The invention relates to the technical field of network and information retrieval, and discloses a method for classifying query intentions based on the distribution of relevance scores. The method comprises the following steps: S1, acquiring the search results and web pages for a query; S2, constructing a search result set from the search results and the web pages; S3, measuring the relevance scores of the documents in the search result set; and S4, classifying the query intent using the distribution of the relevance scores. The relevance scores of the search results are obtained with an improved HITS algorithm, an improved PageRank algorithm, and an improved retrieval model. This solves the problem that traditional schemes lack sufficient click logs for long-tail queries, and also the problem that anchor text-based methods either cannot find a matching anchor text set or find one with too few elements.
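
For orientation, the sketch below implements the standard PageRank power iteration over the link graph of a result set. It is the textbook algorithm, not the improved variant the abstract refers to, whose details do not appear in this excerpt. The function name, the `links` input format, the damping factor of 0.85, and the iteration count are assumptions made for illustration.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Standard PageRank by power iteration.

    links maps each page to the pages it links to. Pages that appear only
    as link targets are included as nodes with no outgoing links.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    ordered = sorted(nodes)
    n = len(ordered)
    if n == 0:
        return {}
    scores = {node: 1.0 / n for node in ordered}
    for _ in range(iterations):
        # Score held by pages without out-links is redistributed uniformly.
        dangling = sum(scores[node] for node in ordered if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {node: base for node in ordered}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * scores[source] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores
```

The returned scores sum to 1, so they can be read directly as a distribution over the documents in the result set.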

Description

Technical field

[0001] The invention relates to the technical field of network and information retrieval, and in particular to a method for classifying query intentions based on the distribution of relevance scores.

Background

[0002] With the development and popularization of network and information retrieval technology, search engines play an increasingly important role in users' daily online activities. Analyzing the needs behind users' queries has therefore gradually become an important research direction in the search engine field. Existing studies have found that users choose different search results for different information needs. If a search engine can infer the user's information need, it can provide search results that better match that need, thereby improving user satisfaction.

[0003] Query intent is defined as the information need behind a query. Users have various in...


Application Information

IPC(8): G06F17/30
Inventor: 闫宏飞, 刘晓兵, 徐谷子, 何靖, 李铄
Owner: PEKING UNIV