MG-LSTM-based citation difference matching method and apparatus, and storage medium

A citation matching technology applied to neural learning methods, text database querying, and unstructured text data retrieval. It addresses the problems of low discovery efficiency, accuracy that is difficult to guarantee, and heavy workload.

Active Publication Date: 2020-12-04
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a citation difference matching method, device and storage medium based on MG-LSTM (Multi-granularity Long Short-Term Memory)...



Examples


Embodiment 1

[0059] In order to realize the purpose of the present invention, it is first necessary to build and train a citation difference recognition model, and the specific process is as follows.

[0060] A citation includes metadata such as the title, authors, and publisher. Since citations are composed of different metadata with different text characteristics, this embodiment uses two kinds of embedding, word embedding and character embedding: the title, author, and publisher are each segmented at word or character granularity, and the resulting sequences are mapped to a low-dimensional vector space. In this embodiment, the title metadata is taken as an example for specific description. First, the title metadata of the citation to be screened and of the credible citation are segmented and converted to title embedding vectors, respectively. The two title embedding vectors form the title embedding vector pair, and the elements in each vector are the embedded representati...
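The segmentation step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `segment` function, the `EmbeddingTable` class, the random untrained vectors, and the sample titles are all assumptions introduced here. The regular-expression word splitting also assumes space-delimited text; Chinese titles would need a dedicated word segmenter.

```python
import random
import re


def segment(text, granularity):
    """Split a metadata string into word tokens or character tokens."""
    text = text.lower().strip()
    if granularity == "word":
        # Runs of word characters; assumes space-delimited text.
        return re.findall(r"\w+", text)
    if granularity == "char":
        # Every non-whitespace character becomes its own token.
        return [ch for ch in text if not ch.isspace()]
    raise ValueError(f"unknown granularity: {granularity}")


class EmbeddingTable:
    """Hypothetical lookup table: each new token gets a fixed random vector.

    A trained model would learn these vectors; random values stand in here.
    """

    def __init__(self, dim, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self._vectors = {}

    def lookup(self, token):
        if token not in self._vectors:
            self._vectors[token] = [
                self._rng.uniform(-0.1, 0.1) for _ in range(self.dim)
            ]
        return self._vectors[token]


def embed_pair(screened, credible, granularity, table):
    """Return the embedding sequences for one metadata field of both citations."""
    a = [table.lookup(t) for t in segment(screened, granularity)]
    b = [table.lookup(t) for t in segment(credible, granularity)]
    return a, b


table = EmbeddingTable(dim=4)
title_pair = embed_pair("Deep Learning", "Deep Learning Methods", "word", table)
```

Here `title_pair` holds one sequence of vectors per citation. Identical tokens in the two titles map to identical vectors, which is what lets a later comparison step detect where the two titles diverge.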

Embodiment 2

[0094] This embodiment provides a citation difference matching device based on MG-LSTM, including:

[0095] Data acquisition module: used to acquire the title, author, and publisher metadata of the citation to be screened and of the credible citation;

[0096] Granularity segmentation module: used to segment the title, author, and publisher metadata of the citation to be screened and of the credible citation, with words and characters as the segmentation granularity, and convert them into a title embedding vector pair, an author embedding vector pair, and a publisher embedding vector pair;

[0097] Embedding vector weighting module: used to learn the weights of the title embedding vector pair, the author embedding vector pair, and the publisher embedding vector pair respectively based on the attention mechanism, and to update the title embedding vector pair, the author embedding vector pair, and the publisher embedding vector pair based on the corresponding weights;

[0098] Citation difference recognition...
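The weighting module above can be illustrated with a small cross-attention sketch. It assumes plain dot-product scoring with no learned parameters, which the source does not specify; the function names `attend` and `update_pair` and the sample vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attend(queries, keys):
    """For each query vector, return the attention-weighted sum of the key vectors."""
    dim = len(keys[0])
    updated = []
    for q in queries:
        # Assumption: dot-product scores; a trained model would learn its scoring function.
        weights = softmax([dot(q, k) for k in keys])
        updated.append(
            [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]
        )
    return updated


def update_pair(a, b):
    """Update each side of an embedding vector pair using weights over the other side."""
    return attend(a, b), attend(b, a)


screened = [[1.0, 0.0], [0.0, 1.0]]
credible = [[10.0, 0.0]]
new_screened, new_credible = update_pair(screened, credible)
```

Each updated vector is a mixture of the other citation's vectors, so tokens with a close counterpart on the other side dominate the result. This is one way to represent the relation between the two citations' metadata before classification.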

Embodiment 3

[0106] This embodiment provides a computer-readable storage medium that stores a computer program. The computer program is adapted to be loaded by a processor to execute the MG-LSTM-based citation difference matching method described in Embodiment 1.

[0107] Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.


Abstract

The invention discloses a citation difference matching method and apparatus based on MG-LSTM, and a storage medium. The method comprises: obtaining the title, author, and publisher metadata of a citation to be screened and of a credible citation; segmenting the title, author, and publisher metadata of both citations, with words and characters as the segmentation granularities, and converting them into a title embedding vector pair, an author embedding vector pair, and a publisher embedding vector pair; learning the weight of each embedding vector pair based on an attention mechanism, and updating each embedding vector pair based on the corresponding weight; and inputting each updated embedding vector pair into a pre-trained citation difference recognition model, which outputs the category of the citation difference matching result. The method supports fine-grained citation screening and identifies the type of difference in a citation. The attention mechanism better represents the relation between the metadata of the citation to be screened and that of the credible citation, and the bidirectional LSTM network retains feature information from both directions, which ensures screening accuracy.
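The bidirectional LSTM mentioned in the abstract reads a sequence in both directions and keeps the features from each. The sketch below illustrates that idea only: the weights are random and untrained, and the class `LSTMCell`, the helpers `run` and `bilstm`, the dimensions, and the sample input are assumptions, not the patent's model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A small LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        width = input_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.w = {
            g: [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                for _ in range(hidden_dim)]
            for g in "ifog"
        }
        self.b = {g: [0.0] * hidden_dim for g in "ifog"}

    def _gate(self, name, z, squash):
        return [
            squash(sum(w * x for w, x in zip(row, z)) + bias)
            for row, bias in zip(self.w[name], self.b[name])
        ]

    def step(self, x, h, c):
        z = list(x) + list(h)  # concatenate input and previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        o = self._gate("o", z, sigmoid)
        g = self._gate("g", z, math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def run(cell, sequence):
    """Run the cell over the sequence and collect the hidden state at each step."""
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    outputs = []
    for x in sequence:
        h, c = cell.step(x, h, c)
        outputs.append(h)
    return outputs


def bilstm(sequence, forward_cell, backward_cell):
    """Concatenate forward and backward hidden states position by position."""
    fwd = run(forward_cell, sequence)
    bwd = run(backward_cell, sequence[::-1])[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


seq = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
features = bilstm(seq, LSTMCell(2, 3, seed=1), LSTMCell(2, 3, seed=2))
```

Each position in `features` carries forward-direction state (context to its left) followed by backward-direction state (context to its right), so a downstream classifier sees both.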

Description

Technical field

[0001] The present invention relates to the technical field of citation difference identification, in particular to an MG-LSTM-based citation difference matching method, device and storage medium.

Background

[0002] In recent years, as the country's investment in scientific research has continued to grow, the number of applications for science fund projects has repeatedly hit new highs, and these applications contain a large amount of citation data. Faced with massive citation data, relying on management personnel to discover problems, or on reports solicited from the public, involves a huge workload, is inefficient, and makes accuracy difficult to guarantee, so it cannot meet actual needs.

[0003] Citation screening refers to verifying the authenticity of the citation data in a fund project application, and provides auxiliary support for the applicant's preliminary research foundation evaluation in the fund project form re...


Application Information

IPC(8): G06F16/33, G06F40/258, G06F40/284, G06F40/289, G06N3/04, G06N3/08
CPC: G06F40/289, G06F40/258, G06F40/284, G06F16/3331, G06N3/049, G06N3/08, G06N3/044
Inventor: 王也, 龙军, 章成源, 魏翔翔, 杨展
Owner CENT SOUTH UNIV