Cross-lingual post-translation hybrid extension method based on feature word weighted association pattern mining

A weighted association pattern mining technology, applied in the field of information retrieval, that addresses problems such as query topic drift and word mismatch, reducing query drift and improving retrieval performance.

Inactive Publication Date: 2021-07-02
GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

AI Technical Summary

Problems solved by technology

[0004] The invention proposes a cross-language post-translation hybrid extension method based on feature word weighted association pattern mining. Applied to cross-language information retrieval, it addresses query topic drift and word mismatch, and is suitable for practical cross-language search engines and web cross-language information retrieval systems, where it improves retrieval performance.

Examples

Embodiment 1

[0067] As shown in Figure 1, the cross-language post-translation hybrid extension method based on feature word weighted association pattern mining includes the following steps:

[0068] Step 1: Using a machine translation tool, the source language query performs a first cross-language retrieval of target language documents, and the initial-retrieval relevance feedback document set is constructed and preprocessed. The specific steps are:

[0069] (1.1) The source language user query is translated into the target language by a machine translation tool, and the vector space retrieval model is used to search the target language text document set to obtain the first-pass target language documents.

[0070] The machine translation tool can be the Microsoft Bing machine translation interface (Microsoft Translator API), the Google machine translation interface, or a similar service.
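Step (1.1) can be illustrated with a minimal sketch, under loud assumptions: the toy dictionary stands in for the machine translation tool (no real translation API is called), TF-IDF with cosine similarity is one common form of the vector space model (the excerpt does not state the exact weighting), and the names `translate_query`, `tfidf_vector`, `cosine` and `first_pass_retrieval` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-in for a machine translation tool. A real system would
# call an external translation service here instead of a toy dictionary.
TOY_DICTIONARY = {
    "informasi": "information",
    "pencarian": "retrieval",
    "bahasa": "language",
}


def translate_query(source_terms):
    """Translate term by term, dropping terms the toy dictionary lacks."""
    return [TOY_DICTIONARY[t] for t in source_terms if t in TOY_DICTIONARY]


def tfidf_vector(tokens, doc_freq, n_docs):
    """Build a sparse TF-IDF vector (term -> weight) with smoothed IDF."""
    counts = Counter(tokens)
    return {
        term: tf * (math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1)
        for term, tf in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def first_pass_retrieval(source_terms, documents, top_n=2):
    """Translate the query, then rank target language documents by cosine."""
    query = translate_query(source_terms)
    tokenized = [doc.lower().split() for doc in documents]
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    q_vec = tfidf_vector(query, doc_freq, n_docs)
    scored = [
        (cosine(q_vec, tfidf_vector(tokens, doc_freq, n_docs)), i)
        for i, tokens in enumerate(tokenized)
    ]
    # Highest score first; ties broken by document position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:top_n] if score > 0]
```

The function returns document indices rather than text so that a later step can select the top-ranked documents as the feedback set; documents sharing no term with the translated query are left out.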

[0071] (1.2) Construct the initial-retrieval relevance feedback document set by making relevance judgments on the first ...

Abstract

The invention discloses a cross-language post-translation hybrid extension method based on feature word weighted association pattern mining. First, the source language query performs a first retrieval of target language documents, and the initial-retrieval relevance feedback document set is constructed and preprocessed. By comparing item set weights, the method mines this document set for frequent item sets containing the original query terms, pruning candidate item sets using the item set relevance degree and the item weights (or the maximum item weight) of the item set. A chi-square analysis and confidence evaluation framework is then used to mine, from the frequent item sets, association rules over text feature words that contain the original query terms. The antecedents of rules whose consequent is an original query term, and the consequents of rules whose antecedent is an original query term, are extracted as post-translation expansion words, realizing cross-language post-translation hybrid expansion. The invention overcomes defects of existing weighted association rule mining techniques, improves mining efficiency, mines expansion words related to the original query, and improves cross-language retrieval performance. It has high application value and good prospects for wider adoption.
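The rule evaluation and hybrid extraction described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the excerpt does not give the item set weight, relevance degree or pruning formulas, so the sketch substitutes plain document-frequency support, restricts itself to two-term item sets, and uses hypothetical names (`chi_square`, `expansion_terms`) and threshold values. The default `min_chi2=3.84` is the 5% critical value of the chi-square distribution with one degree of freedom.

```python
def chi_square(n, n_a, n_b, n_ab):
    """Chi-square statistic of the 2x2 contingency table for terms A and B.

    n: number of documents, n_a / n_b: documents containing A / B,
    n_ab: documents containing both.
    """
    denom = n_a * (n - n_a) * n_b * (n - n_b)
    if denom == 0:
        return 0.0
    return n * (n * n_ab - n_a * n_b) ** 2 / denom


def expansion_terms(docs, query_terms, min_support=0.3, min_conf=0.6,
                    min_chi2=3.84):
    """Collect expansion terms from rules q -> t and t -> q.

    docs is a list of term sets from the feedback documents. Unweighted
    counts stand in for the weighted measures named in the abstract.
    """
    n = len(docs)
    vocab = set().union(*docs) - set(query_terms)
    expansions = set()
    for q in query_terms:
        n_q = sum(1 for d in docs if q in d)
        if n_q == 0:
            continue
        for t in vocab:
            n_t = sum(1 for d in docs if t in d)
            n_qt = sum(1 for d in docs if q in d and t in d)
            if n_qt / n < min_support:
                continue  # item set {q, t} is not frequent
            if n * n_qt <= n_q * n_t:
                continue  # keep only positively correlated pairs
            if chi_square(n, n_q, n_t, n_qt) < min_chi2:
                continue  # correlation not significant
            conf_q_to_t = n_qt / n_q  # rule q -> t: consequent t expands
            conf_t_to_q = n_qt / n_t  # rule t -> q: antecedent t expands
            if conf_q_to_t >= min_conf or conf_t_to_q >= min_conf:
                expansions.add(t)
    return sorted(expansions)
```

Accepting a term when either rule direction passes the confidence threshold is what makes the expansion hybrid in this sketch: it merges antecedent-based and consequent-based candidates into one set of expansion words.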

Description

Technical field

[0001] The invention belongs to the field of information retrieval, and in particular relates to a cross-language post-translation hybrid extension method based on feature word weighted association pattern mining.

Background

[0002] Cross-language query expansion is one of the core technologies for improving the performance of cross-language information retrieval. It addresses long-standing problems in cross-language information retrieval such as severe query topic drift and word mismatch: a strategy is used to find expansion words related to the original query, and the expansion words are combined with the original query to form a new query for re-retrieval.

[0003] At present, rapidly growing network information resources have become network big data with great economic and research value. Faced with network information resources with multilingual characteristics, when network users use query expressions in languages the...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/31, G06F16/33, G06F16/35, G06F40/58
Inventor: 黄名选
Owner: GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS