Method and device for determining text similarity

A text similarity determination method and device, applied in the field of text similarity determination. It addresses the problem that existing methods yield low-accuracy results that fail to reflect the similarity of the texts themselves, and it achieves a more accurate similarity result.

Active Publication Date: 2022-05-24
SOUTH CHINA NORMAL UNIVERSITY
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] However, if the comprehensive information in the text is ignored, the result can be wrong. For example, text 1 "I chased a dog today" and text 2 "a dog chased me today" have opposite meanings, but the vast majority of current similarity algorithms segment the two texts into almost the same words. They therefore determine that the similarity between the two texts is high, or even that the texts are the same, which is obviously inaccurate.
[0004] It can be seen that the similarity obtained by current text similarity calculation methods has low accuracy and cannot reflect the similarity of the texts themselves.
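
The problem in [0003] can be illustrated with a minimal sketch. The code below is not taken from the patent: it assumes whitespace tokenization and a bag-of-words cosine similarity as a stand-in for the "current similarity algorithms" the text refers to, and the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for word segmentation.
    return text.lower().split()


def bag_of_words_cosine(text_a, text_b):
    # Cosine similarity over word counts; word order is ignored entirely.
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


text_1 = "I chased a dog today"
text_2 = "a dog chased me today"

# Four of the five words in each text are shared, so the score is about 0.8
# even though the two sentences describe opposite events.
score = bag_of_words_cosine(text_1, text_2)
print(round(score, 3))
```

Because the measure only counts shared words, swapping who chased whom barely changes the score, which is the inaccuracy described above.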



Examples


Embodiment Construction

[0086] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, the present invention can be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from the spirit of the present invention. Therefore, the present invention is not limited by the specific embodiments disclosed below.

[0087] The similarity obtained by current text similarity calculation methods has low accuracy and cannot reflect the similarity of the texts themselves.

[0088] In view of this, an embodiment of the present invention provides a new text similarity determination method, which comprehensively considers the grammatical similarity and the topic similarity between two texts to determine the similarity between the texts. Compared with the prior art, the similarity between two texts is determined only by...



Abstract

The invention discloses a new method and device for determining text similarity, which can accurately reflect the similarity of the texts themselves. The method for determining text similarity includes: obtaining a first text and a second text whose similarity is to be determined; determining the grammatical similarity and the topic similarity between the first text and the second text; and determining the similarity between the first text and the second text according to the determined grammatical similarity and topic similarity.
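
The three steps in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented method: the source does not say how grammatical similarity, topic similarity, or their combination are computed. Here adjacent word-pair overlap stands in for grammatical similarity, word-set overlap stands in for topic similarity, and a weighted average with a hypothetical `weight` parameter combines them.

```python
def tokens(text):
    # Assumption: lowercase whitespace splitting stands in for word segmentation.
    return text.lower().split()


def jaccard(set_a, set_b):
    # Jaccard overlap; two empty sets are treated as identical.
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def grammatical_similarity(text_a, text_b):
    # Stand-in (assumption): overlap of adjacent word pairs, which is
    # sensitive to word order and so to who does what to whom.
    words_a, words_b = tokens(text_a), tokens(text_b)
    pairs_a = set(zip(words_a, words_a[1:]))
    pairs_b = set(zip(words_b, words_b[1:]))
    return jaccard(pairs_a, pairs_b)


def topic_similarity(text_a, text_b):
    # Stand-in (assumption): overlap of the word sets, ignoring order.
    return jaccard(set(tokens(text_a)), set(tokens(text_b)))


def text_similarity(text_a, text_b, weight=0.5):
    # Hypothetical combination rule: weighted average of the two scores.
    grammatical = grammatical_similarity(text_a, text_b)
    topic = topic_similarity(text_a, text_b)
    return weight * grammatical + (1 - weight) * topic


text_1 = "I chased a dog today"
text_2 = "a dog chased me today"

print(grammatical_similarity(text_1, text_2))  # 1 shared pair out of 7
print(topic_similarity(text_1, text_2))        # 4 shared words out of 6
print(text_similarity(text_1, text_2))
```

With these stand-ins, the two example sentences share most of their words but only one adjacent word pair, so the combined score falls well below the word-overlap score alone.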

Description

Technical field

[0001] The invention relates to the field of computer technology, and in particular, to a method and device for determining text similarity.

Background

[0002] In the prior art, the similarity of two texts is generally judged by segmenting the two texts into words and then identifying the words that are repeated across the two texts.

[0003] However, this ignores the comprehensive information in the text. For example, text 1 "I chased a dog today" and text 2 "A dog chased me today" have opposite meanings, but the vast majority of current similarity algorithms segment the two texts into almost the same words. They therefore determine that the two texts are highly similar or even the same, which is obviously inaccurate.

[0004] It can be seen that the similarity obtained by current text similarity calculation methods has low accuracy and cannot reflect the similarity of the texts themselves.

SUMMARY ...


Application Information

Patent Type & Authority Patents (China)
IPC IPC(8): G06F40/289, G06F40/253, G06K9/62
CPC G06F40/211, G06F40/289, G06F40/30
Inventor 周春, 郑百成, 黄妍明, 方永毅, 瞿荣, 蒋运承
Owner SOUTH CHINA NORMAL UNIVERSITY