Text similarity determination method, device, equipment and storage medium

A text similarity determination method, with a corresponding device, equipment and storage medium, that addresses the low accuracy of similarity determined by existing methods.

Active Publication Date: 2019-01-04
KINGSOFT

AI Technical Summary

Problems solved by technology

Word-based methods can judge text to be analyzed 1 and text to be analyzed 2 as similar when the two texts are in fact completely different, so the accuracy of the determined similarity is low.


Embodiment Construction

[0060] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0061] In the prior art, the process of determining text similarity only considers whether the words are the same, but does not consider the meaning of the words in the text context. In practice, the same words may have different meanings in different contexts. In this way, it is possible to judge words that are the same but have different contextual meanings as the same words, or judge words that are written differently but have the same contextual meaning as...

Abstract

The embodiment of the invention provides a text similarity determination method, device, equipment and storage medium. The method comprises the following steps: determining the texts to be analyzed; splitting each text to be analyzed into sentences to obtain a plurality of sentences corresponding to that text; for each sentence, inputting the sentence into a pre-trained neural network model to obtain a semantic feature vector corresponding to the sentence, wherein the neural network model is trained on a plurality of first training samples and the associated sentences corresponding to each first training sample; determining a specific feature vector corresponding to each text to be analyzed according to the semantic feature vectors of its sentences; and calculating the similarity between the specific feature vectors of the texts to be analyzed, and using it as the similarity between the texts to be analyzed. In this way, the accuracy of text similarity determination can be improved.
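The steps in the abstract can be sketched in Python as follows. This is only an illustrative sketch under several assumptions that the abstract does not confirm: `toy_sentence_encoder` is a hypothetical stand-in for the pre-trained neural network model (it is a hashed bag of words, not a trained model), the text-level vector is taken as the mean of the sentence vectors, and cosine similarity is used to compare text vectors.

```python
import math
import re
from typing import Callable, List

Vector = List[float]


def split_sentences(text: str) -> List[str]:
    """Split a text into sentences on common Western and CJK end marks."""
    parts = re.split(r"[.!?。！？]+", text)
    return [p.strip() for p in parts if p.strip()]


def toy_sentence_encoder(sentence: str, dim: int = 16) -> Vector:
    """Hypothetical placeholder for the pre-trained model.

    Maps each whitespace-separated word to a bucket and counts it.
    A real system would return a learned semantic feature vector.
    """
    vec = [0.0] * dim
    for word in sentence.lower().split():
        bucket = sum(ord(ch) for ch in word) % dim
        vec[bucket] += 1.0
    return vec


def text_vector(text: str, encoder: Callable[[str], Vector]) -> Vector:
    """Aggregate sentence vectors into one text vector (assumed: mean)."""
    sentence_vectors = [encoder(s) for s in split_sentences(text)]
    if not sentence_vectors:
        raise ValueError("text contains no sentences")
    dim = len(sentence_vectors[0])
    count = len(sentence_vectors)
    return [sum(v[i] for v in sentence_vectors) / count for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def text_similarity(
    text_a: str,
    text_b: str,
    encoder: Callable[[str], Vector] = toy_sentence_encoder,
) -> float:
    """Similarity between two texts via their aggregated sentence vectors."""
    return cosine_similarity(text_vector(text_a, encoder), text_vector(text_b, encoder))
```

Passing the encoder as a parameter keeps the pipeline independent of any particular model, so a trained sentence encoder could be substituted for the placeholder without changing the other steps.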

Description

Technical field

[0001] The present invention relates to the technical field of computer applications, in particular to a text similarity determination method, device, equipment and storage medium.

Background

[0002] Text similarity measures the degree of similarity between texts and is widely used in scenarios such as text clustering analysis, text matching, and repetition rate detection. For example, it can be used to detect plagiarism in papers.

[0003] In the prior art, word-based methods are used to determine the similarity between texts. Specifically: the texts whose similarity is to be analyzed are segmented into words; the number of identical words, or the probability of identical words, between the texts is calculated; and the similarity is determined from that value. For example, if 80% of the words in the two texts are the same, the similarity between the two texts is determined to be 0.8.

[0004] In the prior art, when determining text similarity, only wh...
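The word-based approach described in [0003] can be illustrated with a short Python sketch. It rests on assumptions not stated in the source: words are obtained by whitespace splitting (a real system would use a word segmenter), and the similarity is taken as the share of distinct words the two texts have in common, relative to the larger vocabulary.

```python
def word_overlap_similarity(text_a: str, text_b: str) -> float:
    """Share of distinct words that two texts have in common.

    Assumed formula: |shared words| / max(|words in A|, |words in B|).
    """
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    shared = words_a & words_b
    return len(shared) / max(len(words_a), len(words_b))


# Four of five distinct words are shared, so the score is 4 / 5.
score_partial = word_overlap_similarity("a b c d e", "a b c d f")

# Identical word sets give the maximum score even though the
# two sentences describe different events.
score_reordered = word_overlap_similarity("dog bites man", "man bites dog")
```

The second call shows the weakness the patent targets: a measure that only checks whether words match cannot tell apart texts that use the same words to mean different things.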

Application Information

IPC(8): G06F17/27, G06F16/35, G06K9/62
CPC: G06F40/289, G06F40/30, G06F18/22
Inventors: 史文丽, 王晨光
Owner: KINGSOFT