Cross-language information retrieval method based on conceptual graph

A conceptual-graph-based information retrieval technology, applied to cross-language information retrieval, that addresses problems such as failure to retrieve with simple text features, ambiguity in word translation, and insufficient use of global semantic features.

Active Publication Date: 2019-10-08
CETC BIGDATA RES INST CO LTD

AI Technical Summary

Problems solved by technology

[0004] Traditional retrieval methods often evaluate similarity using shallow text features. Even when semantics are used for similarity comparison, they are usually captured only at the word level, and global semantic features are underused.
In cross-language similarity retrieval, grammatical differences between languages and ambiguity in word translation mean that relevant texts generally cannot be retrieved using simple text features alone.

Embodiment Construction

[0068] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0069] The present invention proposes Conceptual Graph based Cross-language Information Retrieval (hereinafter abbreviated as CG-CLIR), a text-based cross-language information retrieval model for similarity evaluation. The model leverages holistic embeddings of conceptual graphs for semantic retrieval of bilingual texts. In the implementation, the bilingual corpus in the retrieval set is first preprocessed; after the conceptual graphs are constructed, a candidate set is built to store their embedded representations...
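The retrieval step over a candidate set of stored graph embeddings can be sketched as follows. This is a minimal illustration, not the patented procedure: the use of cosine similarity, the function names `cosine` and `rank_candidates`, the document identifiers, and the three-dimensional vectors are all assumptions made for the example, since this excerpt does not state which similarity measure or storage layout is used.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_candidates(query_vec, candidate_set, top_k=2):
    """Score every stored embedding against the query and keep the best top_k."""
    scored = [(doc_id, cosine(query_vec, vec))
              for doc_id, vec in candidate_set.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical candidate set: document id -> graph-level embedding.
candidate_set = {
    "doc_en_1": [0.9, 0.1, 0.0],
    "doc_zh_1": [0.8, 0.2, 0.1],
    "doc_zh_2": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]  # embedding of the query text's graph (made up)
results = rank_candidates(query, candidate_set)
```

Because documents in both languages are embedded into the same vector space, a query in one language can rank candidates in the other without translating either text.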

Abstract

The invention discloses a cross-language information retrieval method based on a conceptual graph. The method, a cross-language similarity evaluation framework called CG-CLIR, comprises the following steps: S1, conceptual-graph edge representation based on Skip-Gram; S2, high-order semantic embedding that fuses edge information, followed by similarity calculation. Step S1 includes text preprocessing and semantic embedding of conceptual-graph edges. Step S2 includes LSTM-based graph-level semantic embedding and similarity calculation based on the graph embeddings. In cross-language text retrieval, the method overcomes the language barrier and achieves semantic retrieval without translation.
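As a rough illustration of the Skip-Gram idea behind step S1, the sketch below generates (center, context) training pairs from graph edges. It assumes, hypothetically, that each edge is serialized as a (head, relation, tail) token triple; this excerpt does not say how the method serializes edges, and the function name `skipgram_pairs` and the sample edges are invented for the example.

```python
def skipgram_pairs(tokens, window=1):
    """Return (center, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical edges, each written as (head concept, relation, tail concept).
edges = [("dog", "chase", "cat"), ("cat", "eat", "fish")]

training_pairs = []
for head, relation, tail in edges:
    # Treat each edge as a three-token sequence (an assumption of this sketch).
    training_pairs.extend(skipgram_pairs([head, relation, tail], window=1))
```

A Skip-Gram model trained on such pairs learns vectors in which tokens that share contexts end up close together; the pairs here only show the form of the training data, not the training itself.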

Description

Technical field

[0001] The invention relates to cross-language information retrieval, in particular to a cross-language information retrieval method based on conceptual graphs.

Background

[0002] At present, vector representation and processing of text has become the mainstream approach in text analysis tasks. The most common approach is to vectorize words, as in the one-hot model and the word embedding model, which represent the n words of a sentence as n d-dimensional vectors, so that the sentence becomes an n×d matrix that is convenient to process. Another approach is to map each sentence or document to a single vector, so that paragraphs and texts become matrices of such vectors. This approach takes longer-range sequence information into account and can better represent global information. However, since sentences vary in length and words, as the basic unit of semantic representation, can have many different combi...
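The two representations described in the background can be illustrated with a small sketch: a one-hot encoding that turns a sentence of n words into an n×d matrix, and a reduction of that matrix to one sentence-level vector. The vocabulary and sentence are made up, and mean pooling is just one simple way to obtain a single vector, chosen here for illustration; the text does not name a specific pooling method.

```python
def one_hot_matrix(tokens, vocab):
    """Return an n x d matrix (list of rows), one one-hot row per token."""
    index = {word: i for i, word in enumerate(vocab)}
    d = len(vocab)
    matrix = []
    for tok in tokens:
        row = [0.0] * d
        row[index[tok]] = 1.0
        matrix.append(row)
    return matrix

def mean_pool(matrix):
    """Collapse an n x d matrix into one d-dimensional vector by averaging rows."""
    n = len(matrix)
    return [sum(column) / n for column in zip(*matrix)]

vocab = ["graph", "retrieval", "concept", "language"]  # d = 4 (hypothetical)
sentence = ["concept", "graph", "retrieval"]           # n = 3

M = one_hot_matrix(sentence, vocab)  # word-level view: 3 x 4 matrix
v = mean_pool(M)                     # sentence-level view: one 4-dim vector
```

The matrix grows with sentence length, while the pooled vector has a fixed size regardless of n, which is why sentence-level vectors are convenient for comparing texts of different lengths.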

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F16/36, G06F17/27, G06K9/62
CPC: G06F16/3344, G06F16/367, G06F16/35, G06F40/30, G06F18/22
Inventor: 刘刚, 张森南, 刘汪洋, 雷吉成, 胡昱临
Owner: CETC BIGDATA RES INST CO LTD