Text comparison algorithm based on a stacked bidirectional LSTM neural network

A text similarity calculation method based on a stacked bidirectional LSTM neural network, applied in the field of natural language processing. It addresses the low accuracy of existing text similarity algorithms, reduces vanishing gradients during propagation, and improves the accuracy of the LSTM neural network model, the classifier, and the resulting similarity score.

Active Publication Date: 2019-02-15
重庆邂智科技有限公司

AI Technical Summary

Problems solved by technology

Natural language itself has a variety of expressions. Due to the large number of synonyms and synonymous phrases that appear in text pairs, there



Examples


Embodiment

[0041] The text similarity calculation method based on the stacked bidirectional LSTM neural network in this embodiment comprises the following steps:

[0042] First, prepare a large unlabeled text corpus, either by crawling the Internet or by collecting existing corpora. Segment the corpus text into words using an existing segmentation method, then compute word vectors from the segmented words using Word2vec or another existing algorithm. The word vectors obtained from the unlabeled corpus serve as the input word vectors.
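As a minimal sketch of how segmented text becomes a sequence of input word vectors, the following Python snippet looks up each token in a word-vector table. The whitespace segmenter, the `toy_table` values, and the zero vector for unknown words are all assumptions made for illustration; in practice the table would come from Word2vec (or a similar algorithm) trained on the unlabeled corpus, and Chinese text would need a real word segmenter.

```python
def segment(text):
    """Stand-in word segmenter: lowercase and split on whitespace."""
    return text.lower().split()


def to_input_vectors(tokens, table, dim):
    """Map each token to its word vector.

    Unknown tokens get a zero vector (an assumption, not from the source).
    """
    zero = [0.0] * dim
    return [table.get(tok, zero) for tok in tokens]


# Hypothetical 3-dimensional vectors, standing in for vectors learned
# from an unlabeled corpus.
toy_table = {
    "how": [0.12, -0.40, 0.33],
    "do": [0.05, 0.21, -0.10],
    "i": [-0.30, 0.08, 0.27],
    "reset": [0.44, 0.19, -0.25],
    "my": [-0.07, -0.15, 0.02],
}

tokens = segment("How do I reset my password")
vectors = to_input_vectors(tokens, toy_table, dim=3)
print(tokens)
print(vectors)
```

Each sentence ends up as a list of fixed-length vectors, one per word, which is the form a sequence model consumes.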

[0043] Then, prepare corpus texts with similarity labels, segment them, and compute word vectors. The word vectors obtained from the labeled corpus serve as target word vectors. Multiple target word vectors are selected to form a target sentence word vector, and the target sentence word ...


Abstract

The present invention discloses a text comparison algorithm based on a stacked bidirectional LSTM neural network, in the field of natural language processing. It comprises the following steps: Step 1, segment an input sentence into words and compute word vectors, which serve as the input word vectors; Step 2, feed the input word vectors into an LSTM neural network arranged as a network stack to obtain the input sentence vector; Step 3, obtain the sentence vectors of two input sentences according to Steps 1 and 2, then feed the two sentence vectors into a classifier to obtain the similarity of the two sentences. The invention enables accurate text similarity calculation.
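The three steps can be sketched in plain Python as follows. This is an untrained illustration under several assumptions that the abstract does not confirm: weights are random, the sentence vector is the mean of the top layer's outputs over time, and cosine similarity stands in for the trained classifier. All function names (`build_stack`, `sentence_vector`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm(input_size, hidden_size, rng):
    """Random weights for one LSTM direction (4 gates over [x; h])."""
    n = input_size + hidden_size
    rows = 4 * hidden_size
    return {
        "hidden": hidden_size,
        "W": [[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(rows)],
        "b": [0.0] * rows,
    }


def run_lstm(params, xs):
    """Run one direction over a sequence; return the hidden state per step."""
    H = params["hidden"]
    h, c = [0.0] * H, [0.0] * H
    outputs = []
    for x in xs:
        z = x + h  # concatenate input with previous hidden state
        pre = [sum(w * v for w, v in zip(row, z)) + b
               for row, b in zip(params["W"], params["b"])]
        i = [sigmoid(p) for p in pre[0:H]]            # input gate
        f = [sigmoid(p) for p in pre[H:2 * H]]        # forget gate
        o = [sigmoid(p) for p in pre[2 * H:3 * H]]    # output gate
        g = [math.tanh(p) for p in pre[3 * H:4 * H]]  # candidate cell state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        outputs.append(h)
    return outputs


def bilstm_layer(fwd, bwd, xs):
    """Concatenate forward and backward hidden states at each time step."""
    f_out = run_lstm(fwd, xs)
    b_out = run_lstm(bwd, xs[::-1])[::-1]
    return [a + b for a, b in zip(f_out, b_out)]


def build_stack(input_size, hidden_size, num_layers, seed=0):
    """Stack layers: each layer after the first reads 2 * hidden_size inputs."""
    rng = random.Random(seed)
    layers, size = [], input_size
    for _ in range(num_layers):
        layers.append((make_lstm(size, hidden_size, rng),
                       make_lstm(size, hidden_size, rng)))
        size = 2 * hidden_size
    return layers


def sentence_vector(layers, word_vectors):
    """Pass word vectors through the stack, then mean-pool (assumption)."""
    xs = word_vectors
    for fwd, bwd in layers:
        xs = bilstm_layer(fwd, bwd, xs)
    return [sum(col) / len(xs) for col in zip(*xs)]


def cosine_similarity(u, v):
    """Stand-in for the classifier: cosine of the two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


layers = build_stack(input_size=3, hidden_size=4, num_layers=2)
sent_a = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2], [0.5, 0.1, 0.1]]
sent_b = [[0.2, -0.1, 0.4], [0.1, 0.2, -0.3]]
vec_a = sentence_vector(layers, sent_a)
vec_b = sentence_vector(layers, sent_b)
print(len(vec_a), cosine_similarity(vec_a, vec_b))
```

Both sentences pass through the same stack, so their vectors live in the same space regardless of sentence length. With random weights the score is not meaningful; a real system would train the stack and the classifier on the similarity-labeled corpus.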

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to a text similarity calculation method based on a stacked bidirectional LSTM neural network.

Background

[0002] Natural language processing often requires measuring the similarity between two texts. Text occupies a high-dimensional semantic space, and the challenge is to abstract and decompose it so that similarity can be quantified mathematically. Text similarity algorithms have a wide range of uses: retrieving content related to an input sentence in information retrieval, judging whether an input question and a knowledge-base question have the same meaning in an intelligent question answering system, and judging the degree of correlation between an input sentence and document sentences in reading comprehension tasks. Therefore, improving the accuracy of text simi...


Application Information

IPC(8): G06F17/27, G06K9/62, G06N3/08
CPC: G06N3/084, G06F40/247, G06F40/284, G06F18/2411, G06F18/22
Inventor: 覃勋辉
Owner: 重庆邂智科技有限公司