Method for acquiring text similarity, terminal device and medium

A text similarity acquisition method, applied in the fields of text similarity acquisition and computer-readable storage media, addresses the low accuracy of existing text similarity calculations and improves both calculation accuracy and comparison efficiency.

Active Publication Date: 2018-10-26
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of this, an embodiment of the present invention provides a method for acquiring text similarity, a terminal device, a

Detailed Description of Embodiments

[0039] In the following description, specific details such as specific system structures and technologies are presented for the purpose of illustration rather than limitation, so as to thoroughly understand the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

[0040] To illustrate the technical solutions of the present invention, specific embodiments are described below.

[0041] Figure 1 shows the implementation flow of the text similarity acquisition method provided by an embodiment of the present invention. The method flow includes steps S101 to S107, and the implementation principle of each step is as follows:

[0042] S101: Obt...

Abstract

The present invention is applicable to the technical field of data processing, and provides a method for acquiring text similarity, a terminal device, and a medium. The method comprises: after acquiring a plurality of word segments corresponding to each to-be-analyzed text, storing the word segments in a word bag; acquiring TF-IDF information of each word segment in a word bag model; based on the TF-IDF information associated with each to-be-analyzed text, generating text set feature matrices corresponding to the plurality of comparison texts and text vectors corresponding to the reference texts respectively; performing singular value decomposition on the text set feature matrices, and according to the obtained word feature matrices and the feature vector weight matrices, performing inverse mapping processing on the text vectors to obtain second feature vectors; and respectively calculating the similarity between each second feature vector and the first feature vector, and outputting a calculation result as the similarity between the preset text and the comparison text matched by the second feature vector. According to the technical scheme of the present invention, the calculation accuracy of the text similarity is improved, and the comparison efficiency of the text is improved.
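The pipeline summarized in the abstract resembles latent semantic analysis, and a minimal sketch may help make the steps concrete. The code below is an illustration under stated assumptions, not the patented implementation: the texts are assumed to be already segmented into word lists, the smoothed IDF formula is an assumption, the "inverse mapping" is interpreted here as the standard LSA fold-in (multiplying by the transposed word feature matrix and dividing by the singular values), and every function name (`tfidf_weights`, `tfidf_vector`, `cosine`, `lsa_similarities`) and the rank parameter `k` are hypothetical.

```python
import math
from collections import Counter

import numpy as np


def tfidf_weights(docs, vocab):
    """Smoothed IDF per vocabulary word (the exact formula is an assumption)."""
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    return np.array([
        math.log((1 + n) / (1 + sum(w in s for s in doc_sets))) + 1.0
        for w in vocab
    ])


def tfidf_vector(doc, vocab, idf):
    """TF-IDF vector of one segmented text over a fixed vocabulary."""
    counts = Counter(doc)
    tf = np.array([counts[w] / max(len(doc), 1) for w in vocab])
    return tf * idf


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def lsa_similarities(comparison_docs, reference_doc, k=2):
    """Similarity of the reference text to each comparison text in a rank-k latent space."""
    vocab = sorted({w for d in comparison_docs for w in d})
    if not vocab:
        return [0.0] * len(comparison_docs)
    idf = tfidf_weights(comparison_docs, vocab)

    # Text set feature matrix: one row per word, one column per comparison text.
    A = np.column_stack([tfidf_vector(d, vocab, idf) for d in comparison_docs])

    # Singular value decomposition: A = U * diag(s) * Vt.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    rank = int(np.sum(s > 1e-12))
    if rank == 0:
        return [0.0] * len(comparison_docs)
    k = max(1, min(k, rank))
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

    # Fold the reference text into the latent space (assumed reading of "inverse mapping").
    q = tfidf_vector(reference_doc, vocab, idf)
    q_k = (U_k.T @ q) / s_k

    # Column j of Vt_k holds the latent coordinates of comparison text j.
    return [cosine(q_k, Vt_k[:, j]) for j in range(len(comparison_docs))]


comparison = [["cat", "sat", "mat"], ["dog", "barked", "loud"], ["cat", "chased", "dog"]]
reference = ["cat", "sat", "mat"]
print(lsa_similarities(comparison, reference, k=2))
```

Because the comparison is made in the reduced latent space rather than on raw word counts, two texts can score as similar even when they share few identical words, which appears to be the limitation of word-overlap methods that the invention targets. Words in the reference text that do not occur in any comparison text are ignored in this sketch.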

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to a method for acquiring text similarity, a terminal device, and a computer-readable storage medium.

Background

[0002] Text similarity measures the degree of similarity between texts. Traditionally, text similarity was determined by human judgment. However, manually judging a large number of similar texts is time-consuming and cumbersome. To solve this problem, approaches such as word frequency statistics and vector space models such as simhash have been developed. These vector space models calculate text similarity by identifying the words common to two articles, based on information such as the presence or absence of words and the frequency of each word. Therefore, only when there are a large number of identical words in two articles, the ca...

Application Information

IPC(8): G06F17/27
CPC: G06F40/205, G06F40/289
Inventor: 李育儒, 王鸿滨, 吴晓贝, 汪伟
Owner: PING AN TECH (SHENZHEN) CO LTD