Systems and methods for interactive search query refinement

A search query refinement technology, applied in the field of search engines, that addresses three problems: developing search expressions is a difficult cognitive task for users of text search engines; broad search queries cannot satisfy the more specific information needs of many different users; and existing approaches cannot guarantee that the related terms they generate reflect the subject matter or vocabulary used within the corpus of documents. It also requires fewer I/O resources at query time.

Inactive Publication Date: 2006-01-12
JOLLIFY MANAGEMENT
46 Cites, 77 Cited by

AI Technical Summary

Benefits of technology

[0021] The present invention provides an improved method for refining a search query that is designed to retrieve documents from a document index. The present invention is advantageous because it does not rely on cross-document data structures or global statistics that must be recomputed each time the corpus is updated. Further, the present invention requires significantly fewer I/O resources at query time (run time) because fewer results need to be fetched at run time than in known methods to produce a short list of relevant suggestions that includes a mix of phrases, single word terms, and specializations (phrases including a query term). In the present invention, each document in the document index is processed at some time prior to the query, for example during the generation of the document index. In this processing, each document in the document index is examined to determine if the document includes any terms suitable for inclusion in a set of ranked candidate terms for the document. When the document includes such terms, the document index entry for the document is configured to include a set of terms associated with the document. This set of terms is called a set of ranked candidate terms.
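The index-time step described in [0021] can be illustrated with a minimal Python sketch. It is not the patented method: the text above does not say how terms are judged suitable or how they are ranked, so the stopword list, the use of in-document frequency as the ranking score, and the names `ranked_candidate_terms` and `build_index` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; the source does not specify how terms are filtered.
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ranked_candidate_terms(text, max_terms=5):
    """Return up to max_terms single words and two-word phrases from one document.

    Assumption: terms are ranked by how often they occur in the document,
    with ties broken alphabetically so the output is deterministic.
    """
    tokens = tokenize(text)
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    # Count two-word phrases in which neither word is a stopword.
    for first, second in zip(tokens, tokens[1:]):
        if first not in STOPWORDS and second not in STOPWORDS:
            counts[f"{first} {second}"] += 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


def build_index(documents):
    """Build a toy document index: doc_id -> index entry.

    The entry carries a 'candidate_terms' list only when the document
    yields at least one suitable term, mirroring the description above.
    """
    index = {}
    for doc_id, text in documents.items():
        entry = {"text": text}
        terms = ranked_candidate_terms(text)
        if terms:
            entry["candidate_terms"] = terms
        index[doc_id] = entry
    return index
```

Because each entry is computed from its own document alone, adding or removing a document in this sketch never forces other entries to be recomputed, which is the property [0021] contrasts with approaches based on global statistics.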

Problems solved by technology

Developing search expressions that both convey a user's information need and match the way that need is expressed within the vocabulary of target documents has long been recognized as a difficult cognitive task for users of text search engines.
A large majority of search engine users begin their search for a document with a query having only one or two words, and are then disappointed when they do not find the document or documents they want within the first ten or so results produced by the search engine.
While user satisfaction can be improved, at least for some searches, by improving the manner in which results are ranked, very broad search queries cannot satisfy the more specific information desires of many different search engine users.
A weakness of such approaches is that there is no guarantee that the related terms so generated actually reflect the subject matter or vocabulary used within the corpus of documents itself.
While many of these approaches are functional, they are somewhat unsatisfactory for very large web search engines, either for reasons of runtime performance or relevance of feedback terms generated.
As noted in Vélez et al., this approach is unsatisfactory because it is an expensive run time technique.
In other words, it will take an unsatisfactory amount of time to compute the set of term suggestions S using DM in cases where the document database (corpus) is large.
While this approach improves runtime performance by precomputing a subset of term relationships off-line, the Vélez et al. approach has drawbacks.
First, there is a context problem.
However, this assumption is not always true.
Because of the underlying assumption in Vélez et al., the approach can potentially lead to inappropriate search term suggestions in some instances or else miss other suggestions that would be more relevant within the context of the entire query.
Accordingly, the computational demands of Xu and Croft are unsatisfactory for very large, dynamic document databases.

Embodiment Construction

[0050] In a typical embodiment, the present invention generates, in an efficient manner, a small set (10-20) of query refinement suggestions (subset of candidate terms) that are potentially highly relevant to a user's query and reflect the vocabulary of target documents.

[0051] As shown in FIG. 1, a search query is submitted by a client computer 100 to a search engine server 110. Upon receiving the search query, search engine server 110 identifies documents in document index 120 that are relevant to the search query. Further, search engine server 110 ranks the relevant documents by, for example, their relevance to the search query among other ranking factors. A description of this group of ranked documents (search results) is then returned to client computer 100 as a group of ranked documents. In the present invention, additional information, in the form of a subset of candidate terms (search refinement suggestions), is returned to the client computer along with the initial group of...

Abstract

A received query is processed so as to generate an initial group of ranked documents corresponding to the received query. Each document in all or a portion of the documents in the initial group of ranked documents is associated with a respective set of ranked candidate terms such that each candidate term in the respective set of ranked candidate terms is embedded within the document. Each respective set of ranked candidate terms is identified at a time prior to the processing of the received query. In accordance with a selection function, a subset of the candidate terms in one or more of the respective sets of candidate terms is selected. In response to the received query, the initial group of ranked documents and the subset of candidate terms are presented.
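The query-time step in the abstract can be sketched as follows. The abstract names a "selection function" without defining it, so the scoring used here (each term earns the reciprocal of its rank position in every top document that lists it), the cutoff parameters, and the name `select_suggestions` are hypothetical choices for illustration only. The sketch assumes each index entry may carry a precomputed `candidate_terms` list.

```python
from collections import defaultdict


def select_suggestions(ranked_docs, index, query, top_docs=50, max_suggestions=10):
    """Pick refinement suggestions from precomputed per-document term lists.

    ranked_docs: document ids in result order for the received query.
    index: doc_id -> entry dict, optionally holding 'candidate_terms'.
    Assumption: a term's score is the sum of 1 / (rank position) over the
    top documents that list it; ties are broken alphabetically.
    """
    query_words = set(query.lower().split())
    scores = defaultdict(float)
    for doc_id in ranked_docs[:top_docs]:
        terms = index.get(doc_id, {}).get("candidate_terms", [])
        for position, term in enumerate(terms):
            # Skip terms made up only of words already in the query.
            if set(term.split()) <= query_words:
                continue
            scores[term] += 1.0 / (position + 1)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_suggestions]]


# Toy data: candidate terms were identified before the query arrived.
sample_index = {
    "d1": {"candidate_terms": ["space shuttle", "nasa", "space"]},
    "d2": {"candidate_terms": ["nasa", "space shuttle"]},
    "d3": {},
}
suggestions = select_suggestions(["d1", "d2", "d3"], sample_index, "space")
```

Only the index entries of the top-ranked documents are read at query time, so no document text has to be fetched or scanned to produce the suggestions. A phrase such as "space shuttle" survives the filter for the query "space" because it adds a word, which corresponds to the specializations mentioned in [0021].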

Description

[0001] This application claims priority to U.S. Patent Application Ser. No. 60/456,905 entitled “Systems and Methods For Interactive Search Query Refinement” filed Mar. 21, 2003, attorney docket number 10130-044-888, which is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to the field of search engines, such as search engines for locating documents in a database or documents stored on servers coupled to the Internet or in an intranet, and in particular the present invention relates to systems and methods for assisting search engine users in refining their search queries so as to locate documents of interest to the users.

BACKGROUND OF THE INVENTION

[0003] Developing search expressions that both convey a user's information need and match the way that need is expressed within the vocabulary of target documents has long been recognized as a difficult cognitive task for users of text search engines. A large majority of search...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F17/30
CPC: G06F17/30646, Y10S707/99935, Y10S707/99943, G06F16/3325, G06F7/00
Inventors: ANICK, PETER G.; GOURLAY, ALASTAIR; THRALL, JOHN
Owner: JOLLIFY MANAGEMENT