
Processing of query terms

A technique for handling query languages and query terms, applied in electronic digital data processing, natural language data processing, special data processing applications, and related fields, to achieve the effect of increasing possibilities.

Active Publication Date: 2010-12-22
GOOGLE LLC
Cites: 2; Cited by: 0

AI Technical Summary

Problems solved by technology

Neither content authors nor search engine users may be able to conveniently generate the characters they prefer.



Embodiment Construction

[0024] The synonyms map contains, as keys, common forms of words, each of which is associated with one or more variants. For example, consider a simple corpus in which only two languages are found: French and English. If "elephant" is a common-form entry in the synonyms map and the variants "elephant", "éléphant", and "eléphant" are found in the corpus, those variants are associated with the entry as values. Each value also includes additional information: the language of the documents in which instances of the variant appear, and the number of times the variant occurs in that language. Continuing with the example, "eléphant" might be found 90 times in corpus documents considered to be English and 300 times in documents considered to be French.
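The structure described in [0024] can be pictured as a nested mapping from key to variant to per-language count. The following is a minimal sketch under stated assumptions, not the patent's implementation: the class name `SynonymsMap`, its methods, and the language codes "en" and "fr" are hypothetical, and only the counts 90 and 300 come from the example above.

```python
from collections import defaultdict


class SynonymsMap:
    """Maps a common-form key to its variants, with per-language occurrence counts."""

    def __init__(self):
        # key -> variant -> document language -> occurrence count
        self._entries = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

    def add_occurrences(self, key, variant, language, count=1):
        """Record that `variant` was seen `count` times in documents of `language`."""
        self._entries[key][variant][language] += count

    def variants_for(self, key):
        """Return {variant: {language: count}} for a key, or {} if the key is unknown."""
        if key not in self._entries:
            return {}
        return {v: dict(langs) for v, langs in self._entries[key].items()}


# Hypothetical language codes; counts taken from the example in [0024].
synonyms = SynonymsMap()
synonyms.add_occurrences("elephant", "eléphant", "en", 90)
synonyms.add_occurrences("elephant", "eléphant", "fr", 300)

print(synonyms.variants_for("elephant"))
# {'eléphant': {'en': 90, 'fr': 300}}
```

Keeping the counts per language, rather than a single total, is what would let a later step weigh a variant differently depending on the language attributed to the query.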

[0025] Process 100 operates on a training corpus of documents (step 110). The training corpus is ideally a collection of documents representative of the documents contained in the search corpus. Alternativ...


Abstract

Methods, systems, and apparatus, including computer program products, for performing operations relating to processing query terms in a search query presented to a search engine. In one aspect, a method includes determining a query language from the query terms and the language of a user interface. In another aspect, a method includes using the interface language to select one or more mappings, using the mappings to simplify each query term, and applying each simplified query term to a synonyms map to identify possible synonyms with which to augment the search query. In another aspect, a synonyms map is generated from a corpus of documents. In another aspect, a method includes identifying one or more potential synonyms for a query term by looking up the simplified query term in a synonyms map, the synonyms map mapping each of a plurality of keys to one or more variants, each variant being a word associated with one or more document languages.
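The second aspect of the abstract (select mappings by interface language, simplify each query term, then look the simplified term up in a synonyms map) can be illustrated roughly as follows. This is a sketch only: the mapping tables, the sample synonyms map, and the function names `simplify` and `augment_query` are assumptions made for illustration, and the source does not specify how mappings are represented or how the augmented query is assembled.

```python
# Hypothetical per-interface-language character mappings (illustrative only).
SIMPLIFICATION_MAPPINGS = {
    "fr": {"é": "e", "è": "e", "ê": "e", "à": "a", "ç": "c"},
    "en": {"é": "e"},
}

# Hypothetical synonyms map: simplified key -> {variant: {document language: count}}.
SYNONYMS_MAP = {
    "elephant": {
        "eléphant": {"en": 90, "fr": 300},
    },
}


def simplify(term, interface_language):
    """Apply the character mappings selected by the interface language."""
    mapping = SIMPLIFICATION_MAPPINGS.get(interface_language, {})
    return "".join(mapping.get(ch, ch) for ch in term)


def augment_query(terms, interface_language):
    """Return one list per query term: the original term followed by its variants."""
    augmented = []
    for term in terms:
        key = simplify(term, interface_language)
        variants = SYNONYMS_MAP.get(key, {})
        # Keep the original term first, then add variants that differ from it.
        augmented.append([term] + [v for v in variants if v != term])
    return augmented


print(augment_query(["éléphant", "zoo"], "fr"))
# [['éléphant', 'eléphant'], ['zoo']]
```

A term with no entry in the synonyms map, such as "zoo" here, passes through unchanged.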

Description

Background

[0001] The present invention relates to handling linguistic uncertainty in processing search queries and in searching over repositories that include documents and other searchable resources, where the queries and resources may be expressed in any of a number of different languages.

[0002] Search engines index documents and provide methods to search documents whose content is indexed by the search engine. Documents are written in many different languages; some documents have content in multiple languages. Various characters are used to represent words in these languages: the Latin alphabet (i.e., the 26 unaccented characters from A to Z, upper and lower case), diacritics (i.e., accented characters), ligatures (e.g., β), Cyrillic characters, and others.

[0003] Unfortunately, the ability and ease of generating these characters varies greatly from device to device. Neither the author of the content nor the user of the search engine may be able to conveni...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F7/00
CPC: G06F17/2217, G06F17/2795, G06F17/2223, G06F17/30684, G06F16/3344, G06F40/129, G06F40/126, G06F40/247
Inventors: Ruchira S. Datta (鲁齐拉·S·达特), Fabio Lopiano (法比奥·洛皮亚诺)
Owner: GOOGLE LLC