Text similarity calculation method, device, and storage medium

A text similarity calculation technology, provided as a method, a device, and a storage medium. It addresses the problem that existing approaches do not consider the similarity of the words used in the texts and therefore evaluate text similarity poorly, and it achieves the effect of improved calculation accuracy.

Active Publication Date: 2019-06-14
武汉瓯越网视有限公司

AI Technical Summary

Problems solved by technology

[0004] Existing methods that express text similarity as the distance between word vectors measure similarity only from a semantic perspective. They generally do not consider the similarity of the words used in the texts, so they evaluate text similarity poorly.

Method used

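Word segmentation is performed on two to-be-processed texts to obtain two first vocabulary sets, from which a first similarity is calculated. The two texts are input into a preset N-gram language model to obtain two second vocabulary sets, from which a second similarity is calculated. The similarity of the two texts is then calculated from the first and second similarities according to a preset adjustment parameter for each.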

Embodiment Construction

[0028] Hereinafter, embodiments of the present disclosure will be described with reference to the drawings. It should be understood, however, that these descriptions are exemplary only, and are not intended to limit the scope of the present disclosure. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concepts of the present disclosure.

[0029] The terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the present disclosure. The words "a", "an" and "the" used herein shall also include the meanings of "plurality" and "multiple", unless the context clearly indicates otherwise. The term "comprising" and similar terms used herein indicate the presence of stated features, steps, operations and/or components, but do not exclude the presence or addition of one or more other features, steps, operations or components.

[003...

Abstract

A text similarity calculation method, applied in the technical field of computer applications, comprises the following steps: performing word segmentation on each of two to-be-processed texts to obtain two first vocabulary sets, and calculating a first similarity of the two texts based on the two first vocabulary sets; inputting the two texts into a preset N-gram language model to obtain two second vocabulary sets, and calculating a second similarity of the two texts based on the two second vocabulary sets; and calculating the similarity of the two texts based on the first similarity and the second similarity, according to a preset adjustment parameter of the first similarity and a preset adjustment parameter of the second similarity. The invention further provides a text similarity calculation device and a storage medium. Because the calculation considers both the similarity between text semantics and the similarity of the words used in the texts, the resulting text similarity is more accurate.
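The three steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how each similarity is computed or how the adjustment parameters are applied, so everything specific here is an assumption made for illustration: whitespace splitting stands in for word segmentation, character n-grams stand in for the output of the preset N-gram language model, Jaccard overlap is used as the set similarity, and the hypothetical parameters `alpha` and `beta` combine the two scores as a weighted sum.

```python
def tokenize(text: str) -> set:
    """Stand-in for word segmentation (assumption): split on whitespace."""
    return set(text.lower().split())


def char_ngrams(text: str, n: int = 2) -> set:
    """Stand-in for the preset N-gram model (assumption): character n-grams
    taken over the text with whitespace removed."""
    s = "".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets (assumed measure); 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def combined_similarity(text_a: str, text_b: str,
                        alpha: float = 0.5, beta: float = 0.5,
                        n: int = 2) -> float:
    """Weighted sum of the two similarities (assumed combination rule).

    alpha and beta are hypothetical names for the two preset adjustment
    parameters mentioned in the abstract.
    """
    first = jaccard(tokenize(text_a), tokenize(text_b))
    second = jaccard(char_ngrams(text_a, n), char_ngrams(text_b, n))
    return alpha * first + beta * second


if __name__ == "__main__":
    print(combined_similarity("the cat sat", "the cat sat"))
    print(combined_similarity("the cat sat", "a dog ran"))
```

With `alpha + beta = 1`, the combined score stays in the range 0 to 1. Other choices (cosine similarity over counts, word-level n-grams, a different weighting) would fit the abstract's wording equally well.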

Description

Technical field

[0001] The present disclosure relates to the technical field of computer applications, and in particular to a method, device, and storage medium for calculating text similarity.

Background

[0002] Text similarity is a way to quantify the similarity between texts. In recent years, it has been widely used in information retrieval, duplicate-document detection, machine translation, public opinion monitoring, and other fields.

[0003] In existing techniques for calculating text similarity, a common practice is to use a vector space model to map the text into word vectors in a semantic space and then calculate the spatial distance between the word vectors.

[0004] The existing method of expressing text similarity by calculating the distance between word vectors measures the similarity of text from the perspective of semantics. Generally, the similarity of words used in the text is not considered, so the effect of evalua...
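The vector-based practice described in paragraph [0003] can be sketched as follows. This is a simplified illustration, not the method of any particular prior system: it assumes term-count vectors built from whitespace tokens in place of learned word vectors in a semantic space, and it assumes cosine similarity as the distance-style measure. The function name `cosine_similarity` is chosen here for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two term-count vectors.

    Assumption: whitespace tokens and raw counts stand in for the
    word vectors of a real vector space model.
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())

    # Dot product over the terms the two texts share.
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())

    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("text similarity method", "text similarity device"))
```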

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/33
Inventor: 徐乐乐
Owner: 武汉瓯越网视有限公司