Inter-word weighting associating mode-based Vietnamese-to-English cross-language text retrieval method and system

A weighted-association, cross-lingual technology, applied in special data processing applications, instruments, electronic digital data processing, etc., that can address problems such as query topic drift, word mismatch, and cross-language retrieval performance that is inferior to monolingual retrieval.

Inactive Publication Date: 2017-02-01
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

AI Technical Summary

Problems solved by technology

[0003] Scholars around the world have studied cross-lingual information retrieval methods and systems in depth from different angles and have achieved rich results. However, the problems in current cross-lingual information retrieval research have not been completely resolved. Among the open problems that have received the most attention are query topic drift and word mismatch, both of which are more severe in cross-language information retrieval than in monolingual retrieval. These problems often cause cross-language retrieval to perform worse than monolingual retrieval.



Examples


Embodiment Construction

[0064] The technical solution of the present invention will be described in further non-limiting detail below in conjunction with the embodiments and accompanying drawings.

[0065] 1. To better illustrate the technical solution of the present invention, the relevant concepts involved are introduced below:

[0066] Suppose $CLIRdoc = \{d_1, d_2, \ldots, d_n\}$ is the set of target-language documents judged relevant in the initial cross-language retrieval result, where $d_i$ ($1 \le i \le n$) is the $i$-th document in the target-language document set $CLIRdoc$ and $d_i = \{t_1, t_2, \ldots, t_m, \ldots, t_p\}$. Each $t_m$ ($m = 1, 2, \ldots, p$) is called a feature-term item (FTI) of the target language, or feature item for short, and generally consists of words or phrases. The feature item weight set corresponding to $d_i$ is $W_i = \{w_{i1}, w_{i2}, \ldots, w_{im}, \ldots, w_{ip}\}$, where $w_{im}$ is, for the $m$-th feature item $t_m$ in the $i$-th document $d_i$, the corres...
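The notation in [0066] can be illustrated with a small data structure. The sketch below is only an illustration under stated assumptions: the names `WeightedDocument` and `build_weighted_document` are hypothetical, and because the source text is cut off before it states how the weights $w_{im}$ are computed, relative term frequency is used purely as a placeholder weighting.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class WeightedDocument:
    """One document d_i with its feature items and aligned weights."""
    terms: list    # feature items t_1 .. t_p
    weights: list  # w_i1 .. w_ip, same order as terms


def build_weighted_document(tokens):
    """Build d_i and W_i from a token list.

    ASSUMPTION: weights are relative term frequencies. The patent's own
    weighting formula is not visible in the source text.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    terms = sorted(counts)
    weights = [counts[t] / total for t in terms]
    return WeightedDocument(terms, weights)


# CLIRdoc: the set of initially retrieved relevant documents d_1 .. d_n.
raw_documents = ["bank loan bank rate", "loan rate rate"]
clir_doc = [build_weighted_document(text.split()) for text in raw_documents]

for i, doc in enumerate(clir_doc, start=1):
    print(f"d_{i}:", dict(zip(doc.terms, doc.weights)))
```

Keeping `terms` and `weights` as parallel lists mirrors the paired sets $d_i$ and $W_i$ in the text; a dictionary from term to weight would serve equally well.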



Abstract

The invention discloses a Vietnamese-to-English cross-language text retrieval method and system based on an inter-word weighted association mode. The method comprises the following steps: translating a Vietnamese user query into an English query with a machine translation module, and submitting the English query to a text retrieval module to retrieve English documents; performing relevance judgment with a user relevance feedback information extraction module to obtain a set of user-feedback relevant English documents; pre-processing this set with an English document pre-processing module to obtain an initial-retrieval relevant English document library; establishing an English feature word weighted association rule library with a weighted association mode mining module; establishing an English expansion word library with an expansion word generating module; with a query expansion implementation module, resubmitting the combined new query to the text retrieval module to obtain the final English result documents; and translating the final English result documents into Vietnamese through a final result display module and returning them to the user. The method and system can effectively improve cross-language retrieval performance, and have good practical application value and good prospects for wider adoption.
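The query expansion stage of the abstract (mine associations from the feedback documents, collect expansion words, combine them with the translated query) can be sketched roughly as follows. This is a simplified stand-in, not the patent's weighted association mode mining algorithm, which is not given in the text above. The function names, the `min_score` and `top_k` parameters, and the scoring rule (mean weight of a candidate word over the feedback documents that contain a query word) are all assumptions made for illustration. Machine translation and the retrieval engine are outside the sketch.

```python
from collections import defaultdict


def mine_expansion_terms(docs, query_terms, min_score=0.1, top_k=3):
    """Pick expansion words that co-occur with query words.

    docs: list of dicts mapping feature word -> weight (feedback documents).
    ASSUMPTION: the score of a candidate t for a query word q is the sum of
    t's weights over documents containing q, divided by the number of
    documents containing q. This is a hypothetical scoring rule.
    """
    query = set(query_terms)
    scores = defaultdict(float)
    for q in query:
        docs_with_q = [d for d in docs if q in d]
        if not docs_with_q:
            continue
        totals = defaultdict(float)
        for d in docs_with_q:
            for term, weight in d.items():
                if term not in query:
                    totals[term] += weight
        for term, total in totals.items():
            # Keep the best score a candidate reaches for any query word.
            scores[term] = max(scores[term], total / len(docs_with_q))
    kept = [(s, t) for t, s in scores.items() if s >= min_score]
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [t for _, t in kept[:top_k]]


def expand_query(query_terms, expansion_terms):
    """Combine the original query words with new expansion words."""
    original = list(query_terms)
    return original + [t for t in expansion_terms if t not in original]


# English query assumed to come from the machine translation step.
feedback_docs = [
    {"bank": 0.5, "loan": 0.4, "river": 0.1},
    {"bank": 0.6, "loan": 0.3},
    {"tree": 0.9, "river": 0.1},
]
english_query = ["bank"]
expansion = mine_expansion_terms(feedback_docs, english_query)
print(expand_query(english_query, expansion))
```

In the full system, the expanded query would be resubmitted to the text retrieval module, and the final English results translated back into Vietnamese.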

Description

Technical field

[0001] The invention belongs to the field of text information retrieval, and in particular relates to a Vietnamese-English cross-language text retrieval method and system based on an inter-word weighted association mode. It is applicable to fields such as cross-language text information retrieval in which English documents are queried and retrieved in Vietnamese.

Background technique

[0002] Cross-language information retrieval refers to the technology of retrieving information resources in other languages with a query in one language. Vietnamese-English cross-language information retrieval is the cross-language retrieval problem of querying and retrieving English documents in Vietnamese, where Vietnamese, the language in which the query is expressed, is called the source language, and English, the language of the retrieved documents, is called the target language. With the increasingly close exchanges between China and ASEAN countries, the research ...


Application Information

IPC(8): G06F17/30
CPC: G06F16/3337
Inventor 黄名选
Owner GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS