Text similarity calculation method and device

A text similarity calculation method and device, applied in the field of text similarity calculation. The technology addresses the low accuracy of existing approaches and allows text similarity to be calculated quickly.

Active Publication Date: 2021-02-12
北京育学园健康管理中心有限公司

AI Technical Summary

Problems solved by technology

[0005] To solve the technical problem that text similarity calculation results in the above-mentioned prior art have low accuracy, the application provides a text similarity calculation method and device.

Embodiment Construction

[0048] To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0049] Unless otherwise defined, all technical and scientific terms used in the embodiments of the present invention have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the present invention in the embodiments of the present invention are only for ...

Abstract

The embodiment of the invention provides a text similarity calculation method and device. The method comprises: determining a label keyword set and a non-label keyword set in each of a first text and a second text; determining a first similarity between the first label keyword set of the first text and the second label keyword set of the second text, based on a preset hierarchical tree that represents the association relationships between keywords; determining a second similarity between the first non-label keyword set of the first text and the second non-label keyword set of the second text, based on a preset semantic model; and finally determining the text similarity between the first text and the second text according to the first similarity and the second similarity. Because the similarity is calculated from keywords extracted from the texts, the method achieves the purpose of quickly calculating text similarity.
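The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the source does not give the tree-based formula, the semantic model, or the rule for combining the two similarities. The sketch assumes a Wu-Palmer-style depth score on the hierarchical tree, cosine similarity over toy keyword vectors as a stand-in for the preset semantic model, an averaged best-match rule for comparing keyword sets, and a weighted sum for the final score. The tree contents, vectors, function names and the `weight` parameter are all hypothetical.

```python
from math import sqrt

# Hypothetical hierarchical tree (child -> parent); the labels are invented.
PARENT = {
    "health": None,
    "nutrition": "health",
    "sleep": "health",
    "vitamins": "nutrition",
    "breastfeeding": "nutrition",
    "naps": "sleep",
}

# Hypothetical keyword vectors standing in for the "preset semantic model".
VECTORS = {
    "fever": (1.0, 0.0),
    "temperature": (0.8, 0.6),
    "crying": (0.0, 1.0),
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_similarity(a, b):
    """Assumed Wu-Palmer-style score: 2 * depth(lca) / (depth(a) + depth(b))."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_a = set(path_a)
    for node in path_b:  # first shared ancestor is the lowest common one
        if node in ancestors_a:
            lca_depth = len(path_to_root(node))  # the root has depth 1
            return 2 * lca_depth / (len(path_a) + len(path_b))
    return 0.0


def cosine(a, b):
    """Cosine similarity between the toy vectors of two keywords."""
    va, vb = VECTORS[a], VECTORS[b]
    dot = sum(p * q for p, q in zip(va, vb))
    norm_a = sqrt(sum(p * p for p in va))
    norm_b = sqrt(sum(q * q for q in vb))
    return dot / (norm_a * norm_b)


def set_similarity(xs, ys, pair_sim):
    """Assumed set rule: average best match, taken in both directions."""
    if not xs or not ys:
        return 0.0
    forward = sum(max(pair_sim(x, y) for y in ys) for x in xs) / len(xs)
    backward = sum(max(pair_sim(x, y) for x in xs) for y in ys) / len(ys)
    return (forward + backward) / 2


def text_similarity(labels1, others1, labels2, others2, weight=0.5):
    """Combine the first (tree) and second (semantic) similarities.

    `weight` is an assumed mixing parameter, not taken from the source.
    """
    first = set_similarity(labels1, labels2, tree_similarity)
    second = set_similarity(others1, others2, cosine)
    return weight * first + (1 - weight) * second


if __name__ == "__main__":
    score = text_similarity({"vitamins"}, {"fever"},
                            {"breastfeeding"}, {"temperature"})
    print(round(score, 4))
```

In this sketch the two label keywords share the parent "nutrition", so the tree-based score reflects their closeness even though the keywords differ, which plain keyword matching would miss. Keywords missing from the hypothetical tree or vector table raise a `KeyError`; a real system would need a fallback for them.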

Description

Technical field

[0001] The invention belongs to the technical field of the Internet, and in particular relates to a text similarity calculation method and device.

Background technique

[0002] With the rapid development of Internet information technology, people can easily upload or download shared documents, and this sharing model directly leads to a massive number of documents. Currently, sentence matching or keyword matching is mainly used to determine the similarity between texts. However, factors such as the complex and changeable grammatical structure of Chinese sentences and the heterogeneity of semantic context make it more difficult to calculate the similarity between Chinese sentences, so the existing technology cannot quickly and accurately retrieve similar documents from a batch of documents.

[0003] In view of this situation, a large number of solutions have been proposed in the prior art, which are mainly divided into: prior art 1...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06F40/194, G06F40/284, G06F40/30, G06F16/35, G06F16/31
CPC: G06F40/194, G06F40/284, G06F40/30, G06F16/353, G06F16/31, G06F18/22
Inventor: 张姗姗, 姜巍, 于游, 赵永强
Owner: 北京育学园健康管理中心有限公司