A Cross-lingual Information Retrieval Method Based on Concept Map

An information retrieval technique based on concept maps, applied to cross-language information retrieval. It addresses three problems: insufficient use of global semantic features, ambiguity in word translation, and the resulting failure of retrieval that relies on simple text features.

Active Publication Date: 2021-06-29
CETC BIGDATA RES INST CO LTD

AI Technical Summary

Problems solved by technology

[0004] Traditional retrieval methods often rely on shallow text features to evaluate similarity. Even when semantics are used for similarity comparison, they are usually captured only at the word level, and global semantic features are underused.
In cross-language similarity retrieval, differences in grammar between languages and ambiguity in word translation mean that matching on simple text features generally fails.

Examples

Embodiment Construction

[0046] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0047] The present invention proposes Conceptual Graph based Cross-language Information Retrieval (hereinafter abbreviated as CG-CLIR), a cross-lingual information retrieval model for text that evaluates cross-language similarity. The model uses holistic embeddings of concept maps for semantic retrieval over bilingual texts. In the implementation, the bilingual corpus in the retrieval set is first preprocessed and concept maps are constructed; a candidate set is then built to store the embedded representations of those concept maps...
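
The candidate-set lookup described in [0047] can be illustrated with a minimal sketch. It assumes that each document's concept map has already been reduced to a fixed-length embedding and that candidates are ranked by cosine similarity. Both the similarity measure and the names `cosine`, `retrieve`, and `candidate_set` are assumptions made for illustration; the text above does not specify them, and the vectors are toy values.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def retrieve(query_vec, candidates, top_k=2):
    """Rank stored graph embeddings by similarity to the query embedding."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical candidate set: document id -> concept-map embedding (toy values).
candidate_set = {
    "zh_doc_1": [0.9, 0.1, 0.0],
    "zh_doc_2": [0.0, 1.0, 0.0],
    "zh_doc_3": [0.7, 0.7, 0.0],
}

# Stand-in for the embedding of a query written in the other language.
query_embedding = [1.0, 0.0, 0.0]
results = retrieve(query_embedding, candidate_set)
```

Because the query and the candidates are compared in the same embedding space, no translation step appears in the lookup itself; whether the embeddings of the two languages actually share a space depends on how the model is trained.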

Abstract

The present invention discloses a cross-lingual information retrieval method based on concept maps. The method is built on a cross-language similarity evaluation framework, CG-CLIR, and includes the following steps: S1, concept graph edge representation based on Skip-Gram; S2, high-order semantic embedding that fuses edge information, followed by similarity calculation. Step S1 includes text preprocessing and semantic embedding of concept graph edges. Step S2 includes LSTM-based graph-level semantic embedding and similarity computation based on the graph embeddings. The method overcomes language barriers in cross-language text retrieval and achieves semantic retrieval without translation.
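
The abstract names Skip-Gram as the basis for edge representation in step S1 but does not say how edges are presented to it. The sketch below shows only the generic Skip-Gram step of producing (center, context) training pairs, under the assumption that each edge is flattened into a short token sequence of head, relation, and tail. The edge format and the helper names `edge_to_tokens` and `skipgram_pairs` are hypothetical and are not taken from the patent.

```python
def edge_to_tokens(edge):
    """Flatten a (head, relation, tail) edge into a token sequence (assumed format)."""
    head, relation, tail = edge
    return [head, relation, tail]

def skipgram_pairs(tokens, window=1):
    """Return (center, context) pairs for tokens within `window` positions of each other."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# Toy concept-graph edges; real edges would come from the preprocessing in step S1.
edges = [
    ("model", "uses", "embedding"),
    ("embedding", "represents", "graph"),
]

training_pairs = []
for edge in edges:
    training_pairs.extend(skipgram_pairs(edge_to_tokens(edge), window=1))
```

These pairs would then be used to train an embedding model so that tokens appearing on the same edge receive similar vectors; the training itself is omitted here.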

Description

Technical field

[0001] The invention relates to cross-language information retrieval, in particular to a cross-language information retrieval method based on concept maps.

Background art

[0002] At present, vector representation and processing of text has become the mainstream approach in text analysis tasks. The most common approach is to vectorize individual words, as in the one-hot model and word embedding models, which represent the n words of a sentence as n d-dimensional vectors, so that the sentence becomes an n×d matrix that is convenient to process. Another approach is to map each sentence or document to a single vector, so that paragraphs and texts become matrices of such vectors. This approach gives more consideration to longer sequence information and can better represent global information. However, since the sentence is variable in length, and as the basic representation unit of semantics, words can have many different combi...
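
The word-level representation described in [0002] can be made concrete with a small sketch of the one-hot case, in which a sentence of n words becomes an n×d matrix and d equals the vocabulary size. The corpus and the helper names `build_vocab` and `one_hot_matrix` are illustrative assumptions, not part of the patent text.

```python
def build_vocab(sentences):
    """Assign each distinct word an index in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def one_hot_matrix(sentence, vocab):
    """Represent a sentence of n words as an n x d matrix, with d = len(vocab)."""
    d = len(vocab)
    matrix = []
    for word in sentence.split():
        row = [0] * d
        row[vocab[word]] = 1  # a single 1 marks the word's vocabulary index
        matrix.append(row)
    return matrix

corpus = ["graphs encode concepts", "concepts link words"]
vocab = build_vocab(corpus)
matrix = one_hot_matrix("graphs encode concepts", vocab)
```

With a word embedding model the matrix has the same n×d shape, but d is a chosen embedding dimension rather than the vocabulary size, and the rows are dense learned vectors.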

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/33, G06F16/35, G06F16/36, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/3344, G06F16/367, G06F16/35, G06F40/30, G06F18/22
Inventor: 刘刚, 张森南, 刘汪洋, 雷吉成, 胡昱临
Owner: CETC BIGDATA RES INST CO LTD