Text similarity calculation method and device

A text similarity calculation method and device, applied in the field of text similarity calculation. It addresses the problem that inaccurate text similarity calculation degrades classification accuracy, and it improves the accuracy of text similarity calculation by overcoming the limitation of comparing only identical words.

Inactive Publication Date: 2018-05-01
ULTRAPOWER SOFTWARE
5 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

Different individuals express themselves differently, and the same meaning can be expressed with different words, so texts with the same meaning may be worded in many ways. Determining the similarity of two texts only from the words they share seriously reduces the accuracy of text similarity calculation, and when such a method is used to classify business questions raised by customers, it seriously affects the accuracy of classification.
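
To make the problem concrete, here is a minimal sketch of an exact-word baseline. The source does not say which overlap measure the prior art uses, so the Jaccard index and the function name `word_overlap_similarity` are assumptions chosen for illustration, as are the two sample questions.

```python
def word_overlap_similarity(text_a: str, text_b: str) -> float:
    """Jaccard overlap of the word sets of two texts (assumed exact-match baseline)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


# Two hypothetical customer questions with the same intent but no shared words.
q1 = "cancel my subscription"
q2 = "terminate the plan"
print(word_overlap_similarity(q1, q2))  # prints 0.0
```

Because the two questions share no words, the baseline scores them as completely dissimilar even though a customer service classifier should put them in the same category.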



Examples


Detailed Description of the Embodiments

[0050] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained from these embodiments by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

[0051] As shown in Figure 1, a text similarity calculation method includes the following steps:

[0052] 110. Process all the words in the first text and the second text with a first word vector training model to obtain a word vector for every word, where each word vector encodes the context of its corresponding word;

[0053] The first word vector training model here is a model capable of obtaining word ...


Abstract

The embodiment of the invention provides a text similarity calculation method and device. The words in two texts are processed with a first word vector training model to obtain a word vector for each word, the cosine similarity between word vectors is then calculated, and finally the similarity of the two texts is calculated from the maximum cosine similarities of the words. Because each word vector contains the context information of its word, the cosine similarity of two word vectors reflects how close the corresponding words are in meaning. The text similarity calculated from these cosine similarities therefore reflects how close the two texts are in meaning, which improves the accuracy of text similarity calculation and overcomes the limitation of the prior art, in which the similarity of two texts can only be determined from the words they share.
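
The approach in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patent's exact procedure: the abstract does not say how the per-word maximum cosine similarities are combined, so averaging them and then averaging the two directions is an assumption, and `TOY_VECTORS` holds hand-written 3-dimensional vectors standing in for the output of a trained word vector model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def directed_similarity(words_a, words_b, vectors):
    """Average, over the words of A, of each word's best cosine match in B."""
    if not words_a or not words_b:
        return 0.0
    best = [max(cosine(vectors[a], vectors[b]) for b in words_b) for a in words_a]
    return sum(best) / len(best)


def text_similarity(words_a, words_b, vectors):
    """Symmetric score: mean of both directions (an assumed aggregation)."""
    forward = directed_similarity(words_a, words_b, vectors)
    backward = directed_similarity(words_b, words_a, vectors)
    return 0.5 * (forward + backward)


# Hypothetical toy vectors; a real system would obtain these from a trained model.
TOY_VECTORS = {
    "cancel": [0.9, 0.1, 0.0],
    "terminate": [0.8, 0.2, 0.0],
    "subscription": [0.0, 0.9, 0.3],
    "plan": [0.1, 0.8, 0.4],
    "weather": [0.0, 0.0, 1.0],
}

same_intent = text_similarity(["cancel", "subscription"], ["terminate", "plan"], TOY_VECTORS)
different = text_similarity(["cancel", "subscription"], ["weather"], TOY_VECTORS)
print(same_intent > different)  # prints True
```

With these toy vectors, the two texts that share no words but have close meanings score far higher than the unrelated pair, which is the behavior the abstract attributes to context-aware word vectors.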

Description

Technical field

[0001] The embodiments of the present invention relate to the technical fields of text processing and man-machine conversation, and more specifically to a method and device for calculating text similarity.

Background

[0002] At present, automatic business question answering systems are widely used in bank customer service systems, online shopping customer service systems, communication industry customer service systems, and many other settings. Such a system automatically provides answers to business questions raised by customers, which improves service efficiency and avoids answering those questions manually, effectively saving human resources and labor costs.

[0003] The automatic question answering system for business questions will divide the business questions raised by customers into corresponding categories, and then automatically give a unified standard answer according to the categories of business questions. T...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06F17/27, G06F17/30
CPC: G06F16/3329, G06F16/3344, G06F40/289, G06F18/22
Inventors: 蒋宏飞, 王萌萌, 晋耀红, 杨凯程
Owner: ULTRAPOWER SOFTWARE