Method and device for expanding query, search engine system

A query expansion and indexing technology, applied in the field of search queries, that addresses the uncertainty of whether related query words lead to broader information, ensuring diverse related query words and good retrieval results.

Active Publication Date: 2008-10-29
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

AI Technical Summary

Problems solved by technology

The related query words obtained based on this technology have the following problems: the related query words provided are all the same i



Examples


Embodiment 1

[0048] Figure 1 is a flow chart of the first embodiment of the method for expanding a query.

[0049] S101, counting words that co-occur with the query word.

[0050] Counting the words that co-occur with the query word means determining which words appear in the same web page (or article) as the query word. In practical applications, a preferred statistical method is to build an index that uses every query word that has appeared as a keyword; the index content for each keyword is the set of words that appear together with that query word.

[0051] Figure 2 is a diagram of the index. The index is an inverted index structure: each keyword in the index is a query word, and the index content corresponding to each keyword is the set of words that co-occur with that query word. These co-occurring words may originate from multiple web pages. For example, for a certain query word, the co-occurring words are A, B, C, D, wherein words A and B appear simultaneously with the query word ...
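
The index described above can be sketched roughly as follows. This is only an illustrative sketch, not the patent's implementation: the function name `build_cooccurrence_index`, the whitespace tokenization, and the sample pages are all assumptions made for the example. Each query word is a key, and its entry records the co-occurring words together with the pages they came from.

```python
from collections import defaultdict


def build_cooccurrence_index(pages, query_words):
    """Map each query word to the words that share a page with it.

    pages: dict of page id -> page text (hypothetical sample input).
    Returns: query word -> {co-occurring word -> set of page ids}.
    """
    index = defaultdict(lambda: defaultdict(set))
    for page_id, text in pages.items():
        # Assumption: naive whitespace tokenization stands in for real word segmentation.
        tokens = set(text.lower().split())
        for query in query_words:
            if query in tokens:
                for word in tokens - {query}:
                    index[query][word].add(page_id)
    return index


pages = {
    "page1": "apple iphone price",
    "page2": "apple fruit juice",
}
index = build_cooccurrence_index(pages, {"apple"})
```

Keeping the source page ids alongside each co-occurring word is a design choice of this sketch; it makes it possible to tell later which words appeared together on the same page.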

Embodiment 2

[0076] Figure 4 is a flow chart of the second embodiment of the method for expanding a query. Steps S401-S404 are the same as S101-S104 in Embodiment 1 and are not described in detail here.

[0077] S401, counting all words co-occurring with the query word;

[0078] In a search engine system, this step requires a very large database: in web search, the database is the collection of all web pages that users can retrieve, so the demand for computing power is very high. To solve this problem, this embodiment adopts distributed computing, splitting the computing task across a computer cluster and thereby improving processing efficiency.
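
One way to picture the distributed counting in S401 is a split-and-merge scheme: each worker counts co-occurrences on its own shard of pages, and the partial counts are then summed. The sketch below is an assumption for illustration only; the source does not say how the task is partitioned, and a local thread pool stands in here for the computer cluster. The names `count_shard` and `count_cooccurrences` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_shard(shard, query_word):
    """Count, within one shard, the pages on which each word co-occurs with query_word."""
    counts = Counter()
    for text in shard:
        tokens = set(text.lower().split())  # assumption: whitespace tokenization
        if query_word in tokens:
            counts.update(tokens - {query_word})
    return counts


def count_cooccurrences(pages, query_word, n_shards=2):
    """Split pages into shards, count each shard in a worker, then merge the partial counts."""
    shards = [pages[i::n_shards] for i in range(n_shards)]
    total = Counter()
    # Assumption: threads simulate cluster nodes; a real system would use separate machines.
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        for partial in pool.map(lambda shard: count_shard(shard, query_word), shards):
            total.update(partial)
    return total


pages = [
    "apple iphone price",
    "apple fruit juice",
    "apple iphone review",
    "banana fruit",
]
totals = count_cooccurrences(pages, "apple")
```

Because the per-shard counts are simply added, the merged result does not depend on how the pages are split among workers.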

[0079] S402, classifying all co-occurring words;

[0080] S403, in each word class, select the most representative word and name it;

[0081] S404, using the most representative words of each category as related query words of the query word;
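
Steps S402-S404 can be sketched as below. The excerpt does not state how words are classified or how the representative word is chosen, so both rules here are assumptions made purely for illustration: words that share at least one source page are placed in the same class, and the representative of a class is the member seen on the most pages (ties broken alphabetically). The function name `expand_query` and the sample data are hypothetical.

```python
def expand_query(cooccurrence):
    """Return one representative word per class of co-occurring words.

    cooccurrence: word -> set of page ids where it appears with the query word.
    """
    # S402 (assumed rule): words sharing at least one page belong to the same class.
    classes = []  # list of (words, pages) pairs with pairwise disjoint page sets
    for word, pages in cooccurrence.items():
        merged_words, merged_pages = {word}, set(pages)
        untouched = []
        for words, class_pages in classes:
            if class_pages & merged_pages:
                merged_words |= words
                merged_pages |= class_pages
            else:
                untouched.append((words, class_pages))
        classes = untouched + [(merged_words, merged_pages)]

    # S403 (assumed rule): the representative is the word seen on the most pages.
    representatives = [
        min(words, key=lambda w: (-len(cooccurrence[w]), w))
        for words, _ in classes
    ]
    # S404: the representatives serve as the related query words.
    return sorted(representatives)


cooccurrence = {
    "iphone": {"p1", "p3"},
    "price": {"p1"},
    "fruit": {"p2"},
    "juice": {"p2", "p4"},
}
related = expand_query(cooccurrence)
```

With this sample data the words fall into two classes, one around "iphone" and one around "juice", so the related query words cover two different senses of the query rather than two variants of the same one.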

[0082] ...


Abstract

The invention discloses a query expansion method, a device, and a search engine system containing the device. It addresses the problem that the related query words provided by existing search engines may all be of the same nature, so that the results retrieved with them are similar and it is uncertain whether more information over a wider range can be found. The method comprises the following steps: counting the words that co-occur with the query word; classifying all the co-occurring words; selecting a characteristic word for each class; and using the characteristic words of the classes as the related query words of the query word. Compared with the prior art, the invention offers the user queries of multiple types whose natures differ, so that more information over a wider range can be retrieved. The method guides the user toward better query words and thus better retrieval results; in essence, guiding the user means inferring the user's query intent and then dividing it into categories, which yields a better effect.

Description

Technical field

[0001] The invention relates to the field of search query, in particular to a method and device for expanding query and a search engine system including the device.

Background technique

[0002] The development of search engine technology has brought a lot of convenience to the vast number of network users. Users can easily obtain the information they want to know by using search engines. The user enters a query word on the search engine, and the search engine can return web pages containing the query word according to the user's query word. Therefore, for users who use search engines, query words are very important, and only by using appropriate query words can they find the desired webpage.

[0003] At present, each search engine provides a "related search" function in order to help users find appropriate query terms and further improve the quality of search queries. That is, when a user queries a certain word, the search engine will prompt related query ...


Application Information

IPC IPC(8): G06F17/30
Inventor 张智敏
Owner BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD