Short text similarity calculation system and training method thereof

A short text similarity calculation technology, applied in computing, neural learning methods, unstructured text data retrieval, etc. It addresses the problems of semantic information loss and ignored word polysemy, alleviating information loss and avoiding word encodings that are isolated from context.

Active Publication Date: 2020-05-29
铜陵中科汇联科技有限公司

AI Technical Summary

Problems solved by technology

Word-based methods ignore the ambiguity of words in different contexts, while text-level encoding-based methods suffer from loss of semantic information.

Embodiment Construction

[0043] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0044] This application discloses a short text similarity calculation system and its training method. The same encoder is used to encode the two short texts whose similarity is to be calculated. An attention mechanism is then used to obtain the attention of the first text to the second text, and this attention is normalized to obtain the similarity value.

[0045] Referring to Figure 1, as shown in Figure 1, the specific desc...

Abstract

The invention discloses a short text similarity calculation system and a training method thereof. The system comprises the following modules: a text segmentation module, a text encoder, and a text similarity calculation neural network module. The system uses the same encoder to encode the two short texts whose similarity is to be calculated, then uses an attention mechanism to obtain the attention of the first text to the second text, and obtains the similarity value by normalizing that attention. The neural network makes effective use of both the in-context semantic encoding of each word and the semantic encoding of the whole text. Using attention to represent similarity relieves the information loss of text-level semantic encoding and avoids the problem of word-level semantic encoding being isolated from context.
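The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented network: the whitespace `segment` function, the untrained pseudo-random `encode` function, the `DIM` size, the softmax attention, and the cosine-based normalization in `similarity` are all hypothetical stand-ins chosen for illustration, since the listing does not give those details. The sketch only shows the overall shape: one shared encoder for both texts, attention of the first text to the second, and a normalized score.

```python
import zlib

import numpy as np

DIM = 16  # hypothetical embedding size, not taken from the patent


def segment(text):
    """Toy stand-in for the text segmentation module: lowercase whitespace split."""
    return text.lower().split()


def encode(tokens):
    """Toy stand-in for the shared text encoder (the real one is a trained network).

    Row 0 is a text-level vector; the remaining rows are per-word vectors
    mixed with the text-level vector so that they depend on context.
    """
    # Deterministic pseudo-random vector per token, seeded by a CRC of its bytes.
    word_vecs = np.stack([
        np.random.default_rng(zlib.crc32(tok.encode("utf-8"))).standard_normal(DIM)
        for tok in tokens
    ])
    text_vec = word_vecs.mean(axis=0)
    contextual = 0.5 * word_vecs + 0.5 * text_vec
    return np.vstack([text_vec, contextual])


def similarity(text_a, text_b):
    """Attention of text A to text B, normalized into [0, 1] (assumed scheme)."""
    enc_a = encode(segment(text_a))  # the same encoder is used for both texts
    enc_b = encode(segment(text_b))
    # Scaled dot-product scores, then a row-wise softmax over the rows of B.
    scores = enc_a @ enc_b.T / np.sqrt(DIM)
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    attended = weights @ enc_b  # what each row of A "sees" in B
    # Cosine between each row of A and its attended summary of B.
    cos = np.sum(enc_a * attended, axis=1) / (
        np.linalg.norm(enc_a, axis=1) * np.linalg.norm(attended, axis=1)
    )
    # Map the mean cosine from [-1, 1] to [0, 1]; clip to absorb rounding error.
    return float(np.clip((cos.mean() + 1.0) / 2.0, 0.0, 1.0))


if __name__ == "__main__":
    print(similarity("how to reset my password", "password reset steps"))
```

Because the encoder here is untrained, the scores carry no semantic meaning. In a trained system the encoder and the attention parameters would be learned from labeled text pairs.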

Description

Technical field

[0001] The present application relates to the technical fields of text mining and deep learning, and in particular to a short text similarity calculation system and its training method.

Background technique

[0002] Short text similarity calculation is widely used in question answering systems, text classification, and text clustering. Common text similarity calculation methods include: taking words as the basic unit of text and computing the Levenshtein edit distance; treating text as a set of words and computing text similarity from word meanings or word vectors; and using a deep neural network to obtain an encoding of the whole text and computing similarity from the text-level encoded vectors. The word-based methods ignore the ambiguity of words in different contexts, while the text-level encoding-based methods suffer from loss of semantic information.

Contents of the invention

[0003] The purpose of this application is to provide a short text similarity calculation ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/117, G06F40/126, G06F40/205, G06F40/30, G06N3/08
CPC: G06F16/35, G06N3/08
Inventor: 王丙栋, 游世学
Owner: 铜陵中科汇联科技有限公司