Indonesian-English cross-language post-translation hybrid expansion method based on item weight sorting mining

A query expansion method based on item weights, applied in the field of information retrieval, that addresses problems such as query topic drift and word mismatch.

Inactive Publication Date: 2019-04-02
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

AI Technical Summary

Problems solved by technology

[0005] The invention proposes an Indonesian-English cross-language post-translation hybrid expansion method based on item weight sorting mining. It applies to cross-language information retrieval, including practical cross-language search engines and cross-language information retrieval systems, where it improves retrieval performance and addresses query topic drift and word mismatch.

Examples

Embodiment Construction

[0060] In order to better illustrate the technical solution of the present invention, the relevant concepts involved in the present invention are introduced as follows:

[0061] 1. Antecedent and consequent of a feature-word association rule: Let x and y be any feature-word itemsets. An implication of the form x → y is called a feature-word association rule, where x is the antecedent of the rule and y is the consequent.
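
As an illustration of the definition above, the following minimal Python sketch represents a rule x → y as a pair of itemsets and computes its confidence over a toy document set. The names `Rule`, `support`, and `confidence`, the sample documents, and the plain unweighted support measure are assumptions made for illustration; the weighted measures used by the invention are not reproduced here.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """A feature-word association rule x -> y (hypothetical helper type)."""
    antecedent: FrozenSet[str]  # x
    consequent: FrozenSet[str]  # y


def support(itemset: FrozenSet[str], docs: List[FrozenSet[str]]) -> float:
    """Fraction of documents that contain every term of the itemset."""
    return sum(1 for d in docs if itemset <= d) / len(docs)


def confidence(rule: Rule, docs: List[FrozenSet[str]]) -> float:
    """Textbook (unweighted) confidence: support(x ∪ y) / support(x)."""
    s_x = support(rule.antecedent, docs)
    if s_x == 0:
        return 0.0
    return support(rule.antecedent | rule.consequent, docs) / s_x


# Toy feedback documents, each reduced to its set of feature words.
docs = [
    frozenset({"retrieval", "query", "expansion"}),
    frozenset({"retrieval", "query"}),
    frozenset({"query", "translation"}),
    frozenset({"retrieval", "expansion"}),
]

rule = Rule(frozenset({"retrieval"}), frozenset({"expansion"}))
rule_confidence = confidence(rule, docs)  # 2 of the 3 "retrieval" docs also contain "expansion"
```

Here "retrieval" is the antecedent and "expansion" is the consequent of the rule.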

[0062] 2. Indonesian-English cross-language post-translation hybrid expansion:

[0063] Feature-word weighted association rules containing the translated original query terms are mined from the user relevance feedback documents of the initial Indonesian-English cross-language retrieval. From these rules, the method extracts the antecedent itemsets of rules whose consequent is the original query terms, and the consequent itemsets of rules whose antecedent is the original qu...
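
The selection of expansion words described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: the function names `hybrid_expansion_terms` and `expand_query`, the sample rules, and the choice to treat "the consequent (or antecedent) is the original query terms" as "is a non-empty subset of the query terms" are all assumptions.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is an (antecedent, consequent) pair of feature-word itemsets.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def hybrid_expansion_terms(rules: Iterable[Rule], query_terms: Iterable[str]) -> Set[str]:
    """Collect expansion words from both directions of the mined rules."""
    query = set(query_terms)
    expansion: Set[str] = set()
    for antecedent, consequent in rules:
        # Rule whose consequent consists of query terms: take its antecedent.
        if consequent and consequent <= query:
            expansion |= antecedent
        # Rule whose antecedent consists of query terms: take its consequent.
        if antecedent and antecedent <= query:
            expansion |= consequent
    return expansion - query  # never re-add the original query terms


def expand_query(query_terms: List[str], rules: Iterable[Rule]) -> List[str]:
    """Append the expansion words (sorted for determinism) to the original query."""
    return list(query_terms) + sorted(hybrid_expansion_terms(rules, query_terms))


# Hypothetical mined rules over English feature words.
rules = [
    (frozenset({"corpus"}), frozenset({"retrieval"})),            # consequent is a query term
    (frozenset({"retrieval"}), frozenset({"ranking", "index"})),  # antecedent is a query term
    (frozenset({"weather"}), frozenset({"forecast"})),            # unrelated to the query
]
query = ["information", "retrieval"]
new_query = expand_query(query, rules)
# new_query == ["information", "retrieval", "corpus", "index", "ranking"]
```

Taking words from both rule directions is what makes the expansion "hybrid" in this sketch; using only one of the two `if` branches would give a one-directional expansion.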

Abstract

The invention discloses an Indonesian-English cross-language post-translation hybrid expansion method based on item weight sorting mining. First, the Indonesian query is translated into English, English documents are retrieved, and an initial-retrieval user relevance feedback document set is constructed. Next, the weight and frequency of each itemset are combined with the total feature-word weight and the total number of documents in that feedback set to mine feature-word frequent itemsets from it, with pruning by item weight sorting; feature-word weighted association rules are then mined from the frequent itemsets under a confidence-correlation coefficient evaluation framework. Finally, the antecedent itemsets of rules whose consequent is the original query terms and the consequent itemsets of rules whose antecedent is the original query terms are used as post-translation expansion words, realizing Indonesian-English cross-language post-translation hybrid expansion. Because the method prunes by item weight sorting, mining efficiency is improved, expansion words related to the original query are mined, and Indonesian-English cross-language text retrieval performance is improved; the method has good application value and prospects for wider adoption.
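
The abstract names a confidence-correlation coefficient evaluation framework but this page does not give its formulas. The sketch below therefore substitutes textbook, unweighted definitions: confidence as support(x ∪ y) / support(x), and the phi correlation coefficient between the presence of x and the presence of y. The function names, thresholds, and sample documents are illustrative assumptions, not the invention's weighted measures.

```python
import math
from typing import FrozenSet, List

Docs = List[FrozenSet[str]]


def support(itemset: FrozenSet[str], docs: Docs) -> float:
    """Fraction of documents containing every term of the itemset."""
    return sum(1 for d in docs if itemset <= d) / len(docs)


def phi_correlation(x: FrozenSet[str], y: FrozenSet[str], docs: Docs) -> float:
    """Phi coefficient between 'doc contains x' and 'doc contains y'."""
    p_x, p_y = support(x, docs), support(y, docs)
    p_xy = support(x | y, docs)
    denom = math.sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))
    if denom == 0:
        return 0.0  # x or y occurs in all or none of the documents
    return (p_xy - p_x * p_y) / denom


def accept_rule(x: FrozenSet[str], y: FrozenSet[str], docs: Docs,
                min_conf: float, min_corr: float) -> bool:
    """Keep x -> y only if both confidence and correlation reach their thresholds."""
    p_x = support(x, docs)
    if p_x == 0:
        return False
    conf = support(x | y, docs) / p_x
    return conf >= min_conf and phi_correlation(x, y, docs) >= min_corr


# Toy feedback documents (assumed data).
docs = [
    frozenset({"query", "expansion"}),
    frozenset({"query", "expansion"}),
    frozenset({"query"}),
    frozenset({"translation"}),
]
x, y = frozenset({"query"}), frozenset({"expansion"})
keep = accept_rule(x, y, docs, min_conf=0.6, min_corr=0.5)
```

Requiring a correlation threshold in addition to confidence filters out rules whose consequent is simply frequent everywhere, which confidence alone cannot detect.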

Description

Technical field

[0001] The invention belongs to the field of information retrieval, and specifically relates to an Indonesian-English cross-language post-translation hybrid expansion method based on item weight sorting mining.

Background

[0002] Cross-language query expansion is one of the key technologies for improving the performance of cross-language information retrieval. It refers to the process, during cross-language information retrieval, of using some strategy to find expansion words related to the original query, combining those expansion words with the original query into a new query, and retrieving again. According to the stage of cross-language information retrieval at which it occurs, cross-language query expansion can be divided into three types: pre-translation query expansion, post-translation query expansion, and hybrid query expansion. The pre-translation query expansion model means that before the source languag...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F16/332
CPC: G06F40/58
Inventor: 黄名选
Owner: GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS