Query translation method for multilingual information retrieval systems

An information retrieval and multilingual technology, applied to the field of query translation, that addresses the problem of low translation accuracy.

Status: Inactive
Publication Date: 2010-06-30
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve the problem of low translation accuracy in current translation methods by providing a query translation method for multilingual information retrieval systems.



Examples


Specific Embodiment 1

[0012] Specific Embodiment 1: This embodiment is described with reference to Figure 1. In the query translation method for multilingual information retrieval systems of this embodiment, the multilingual information retrieval system consists of a preprocessing unit 1, a translated word selection unit 2 based on semantic relationships, a translated word selection unit 3 based on semantic attenuation co-occurrence, and a translated word selection unit 4 based on bilingual attenuation co-occurrence. A bilingual dictionary 5 is built into the preprocessing unit 1. The preprocessing unit 1 outputs data to the translated word selection unit 2 based on semantic relationships, which outputs data to the translated word selection unit 3 based on semantic attenuation co-occurrence, which in turn outputs data to the...

Specific Embodiment 2

[0019] Specific Embodiment 2: This embodiment further limits the query translation method for multilingual information retrieval systems described in Embodiment 1. In step 2, the translated word selection unit 2 based on semantic relationships calculates the weight of each candidate translated word as follows:

[0020] Extract the sememes and sememe relations of each received candidate translated word. Then, among all received candidate words, compute the intersection of the sememe relations of every pair of candidate translated words that belong to different keywords. When the intersection is not empty, vote for and score the two candidate translated words that produced it. The resulting score of each candidate translated word is used as its weight.
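The voting step in [0020] can be sketched in Python as follows. This is only an illustration under stated assumptions: the excerpt does not say how many points a vote is worth, so the sketch assumes each non-empty intersection adds its size to both candidates. The function name `score_candidates`, the data layout, and the sample sememe relations are all hypothetical.

```python
from itertools import combinations


def score_candidates(candidates):
    """Score candidate translations by shared sememe relations.

    candidates maps keyword -> {candidate word -> set of sememe relations}.
    Assumption (not stated in the source): a non-empty intersection adds
    its size to the score of both candidates involved.
    """
    scores = {
        (keyword, word): 0
        for keyword, words in candidates.items()
        for word in words
    }
    # Only compare candidates that belong to different keywords.
    for kw_a, kw_b in combinations(candidates, 2):
        for word_a, rel_a in candidates[kw_a].items():
            for word_b, rel_b in candidates[kw_b].items():
                shared = rel_a & rel_b
                if shared:
                    scores[(kw_a, word_a)] += len(shared)
                    scores[(kw_b, word_b)] += len(shared)
    return scores


# Hypothetical data: two source keywords with made-up sememe relations.
example = {
    "yinhang": {
        "bank": {("domain", "finance"), ("host", "institution")},
        "shore": {("location", "river")},
    },
    "daikuan": {
        "loan": {("domain", "finance")},
        "credit": {("domain", "finance"), ("host", "institution")},
    },
}
scores = score_candidates(example)
```

In this made-up example, "bank" shares relations with both candidates of the other keyword and receives the highest score, while "shore" shares none and stays at zero.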

Specific Embodiment 3

[0021] Specific Embodiment 3: This embodiment further limits the query translation method for multilingual information retrieval systems described in Embodiment 2. When the two candidate translated words corresponding to an intersection are voted on and scored, the score of each candidate translated word is obtained by the following formula:

[0022]

[0023] Here, $W_{ij}$ denotes the score of the $j$-th candidate translated word of the $i$-th keyword among the $M$ keywords; $R_{ij}$ denotes the set of all sememe relations of the $j$-th candidate translated word of the $i$-th keyword; $R_{mn}$ denotes the set of all sememe relations of the $n$-th candidate translated word of the $m$-th keyword; and $|R_{mn} \cap R_{ij}|$ denotes the number of elements in the intersection of $R_{ij}$ and $R_{mn}$. The loop variables $m$ and $n$ are natural numbers: $m$ indexes the keywords and $n$ indexes the candidate words of each keyword. $N_m$ is the total number of...
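The formula itself is not reproduced in this excerpt. One form that would be consistent with the symbol definitions in [0023] and the voting rule in [0020] is shown below. It is a reading of those definitions, not the patent's confirmed formula, and it assumes that $N_m$ is the number of candidate translated words of the $m$-th keyword and that the sum skips the keyword $i$ itself.

```latex
% Assumed form only: sums intersection sizes over candidates of all other keywords.
W_{ij} = \sum_{\substack{m = 1 \\ m \neq i}}^{M} \; \sum_{n = 1}^{N_m} \left| R_{mn} \cap R_{ij} \right|
```

Under this reading, pairs with an empty intersection contribute zero, which matches the rule that only non-empty intersections are scored.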



Abstract

The invention discloses a query translation method for multilingual information retrieval systems and solves the problem of low translation accuracy in existing translation methods. The multilingual information retrieval system consists of a preprocessing unit, a translated word selection unit based on semantic relationships, a translated word selection unit based on semantic attenuation co-occurrence, and a translated word selection unit based on bilingual attenuation co-occurrence. The keywords of the sentence to be translated and their candidate translated words are obtained first. The candidate translated words are then screened successively by the translated word selection unit based on semantic relationships, the translated word selection unit based on semantic attenuation co-occurrence, and the translated word selection unit based on bilingual attenuation co-occurrence, and the final screening result is taken as the translation result. The method overcomes the defects of the prior art and can be applied to machine translation and related fields of natural language processing.
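The successive screening described in the abstract can be pictured as a chain of filters over a candidate table. The sketch below is a minimal illustration, not the patented units: `preprocess`, `translate_query`, and `keep_top` are hypothetical names, the dictionary entries are invented, and each real selection unit is replaced by a stand-in that simply keeps the first k candidates, because the excerpt does not give the scoring rules of the co-occurrence units.

```python
from typing import Callable, Dict, List

Candidates = Dict[str, List[str]]
Stage = Callable[[Candidates], Candidates]


def preprocess(keywords: List[str], bilingual_dict: Dict[str, List[str]]) -> Candidates:
    """Look up candidate translations for each keyword in the bilingual dictionary."""
    return {kw: list(bilingual_dict.get(kw, [])) for kw in keywords}


def translate_query(keywords: List[str],
                    bilingual_dict: Dict[str, List[str]],
                    stages: List[Stage]) -> Candidates:
    """Pass the candidate table through each selection stage in order."""
    candidates = preprocess(keywords, bilingual_dict)
    for stage in stages:
        candidates = stage(candidates)
    return candidates


def keep_top(k: int) -> Stage:
    """Stand-in for a selection unit: keeps the first k candidates per keyword."""
    def stage(candidates: Candidates) -> Candidates:
        return {kw: words[:k] for kw, words in candidates.items()}
    return stage


# Invented dictionary entries, for illustration only.
bilingual_dict = {
    "yinhang": ["bank", "shore", "bench"],
    "daikuan": ["loan", "credit", "advance"],
}
# Three stand-in stages mirror the three successive selection units.
stages = [keep_top(3), keep_top(2), keep_top(1)]
result = translate_query(["yinhang", "daikuan"], bilingual_dict, stages)
```

Keeping every stage behind the same signature means a real unit, such as one that ranks candidates by sememe relation scores, could replace a stand-in without changing the driver.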

Description

Technical field

[0001] The invention relates to a query translation method.

Background technique

[0002] With the continuing explosive growth of information on the Internet, the languages in which that information is written are becoming increasingly international, and people have placed higher demands on information retrieval. They are no longer satisfied with searching documents in a single language and instead require multilingual information in the search results. It is becoming more and more common for users to query a multilingual document set. To obtain more, fuller, and more accurate information, and to overcome language barriers, people hope to describe their queries in the language they know best (such as Chinese) while receiving documents written in other languages (such as English) in the retrieval results, that is, to perform multi-lingual inform...


Application Information

IPC(8): G06F17/30, G06F17/28
Inventors: 郑德权, 朱红垒
Owner: HARBIN INST OF TECH