Acquisition method, terminal device and medium of text similarity

A method for acquiring text similarity, applied in the field of text similarity acquisition and computer-readable storage media. It addresses the low calculation accuracy of text similarity, and has the effect of improving both calculation accuracy and comparison efficiency.

Active Publication Date: 2022-04-08
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of this, an embodiment of the present invention provides a method for acquiring text similarity, a terminal device, and a computer-readable storage medium, to solve the problem of relatively low calculation accuracy of text similarity in the prior art.

Examples

Embodiment Construction

[0039] In the following description, specific details such as specific system structures and technologies are presented for the purpose of illustration rather than limitation, so as to thoroughly understand the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

[0040] The technical solutions of the present invention are illustrated below through specific examples.

[0041] Figure 1 shows the implementation flow of the information input method provided by the embodiment of the present invention. The method flow includes steps S101 to S107. The specific implementation principle of each step is as follows:

[0042] S101: Obt...

Abstract

The present invention is applicable to the technical field of data processing, and provides a text similarity acquisition method, a terminal device and a medium. The method includes: after obtaining a plurality of word segments corresponding to each text to be analyzed, storing the word segments in a bag-of-words model; obtaining the TF-IDF information of each word segment in the bag-of-words model; based on the TF-IDF information associated with each text to be analyzed, generating a text set feature matrix corresponding to the multiple comparison texts and a text vector corresponding to the reference text; performing singular value decomposition on the text set feature matrix, and performing reverse mapping on the text vector according to the obtained word feature matrix and feature vector proportion matrix, to obtain the second feature vector; and calculating the similarity between each second feature vector and the first feature vector, and outputting the calculation result as the similarity between the preset text and the comparison text matched by the second feature vector. The invention improves the calculation accuracy of text similarity and improves text comparison efficiency.
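The pipeline summarized in the abstract (bag of words, TF-IDF weighting, singular value decomposition, reverse mapping of the reference text, similarity calculation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the smoothed IDF formula, the number of retained dimensions `k`, and the use of cosine similarity are all hypothetical choices, and the texts are assumed to be already segmented into words. The "word feature matrix" and "feature vector proportion matrix" are read here as the factors U and Σ of the decomposition A = UΣVᵀ, which is also an assumption.

```python
import math
from collections import Counter

import numpy as np


def tfidf_matrix(docs):
    """Build a word-by-document TF-IDF matrix from tokenized documents."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    # Smoothed IDF: an assumed formula, the source does not specify one.
    idf = np.array([math.log((1 + n) / (1 + df[w])) + 1 for w in vocab])
    mat = np.zeros((len(vocab), n))
    for j, d in enumerate(docs):
        for w, c in Counter(d).items():
            mat[index[w], j] = c / len(d) * idf[index[w]]
    return index, idf, mat


def vectorize(tokens, index, idf):
    """TF-IDF vector of one text in the vocabulary of the comparison set."""
    vec = np.zeros(len(index))
    for w, c in Counter(tokens).items():
        if w in index:  # words unseen in the comparison set are ignored
            vec[index[w]] = c / len(tokens) * idf[index[w]]
    return vec


def lsa_similarities(comparison_docs, reference_doc, k=2):
    """Cosine similarity of the reference text to each comparison text
    in a k-dimensional space obtained by singular value decomposition."""
    index, idf, a = tfidf_matrix(comparison_docs)
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    k = min(k, int(np.sum(s > 1e-12)))  # never keep zero singular values
    u_k, s_k, vt_k = u[:, :k], s[:k], vt[:k, :]

    # Reverse mapping (fold-in): q_hat = q^T U_k diag(s_k)^-1
    q = vectorize(reference_doc, index, idf)
    q_hat = (q @ u_k) / s_k

    sims = []
    for d in vt_k.T:  # each row is one comparison text in the reduced space
        denom = np.linalg.norm(d) * np.linalg.norm(q_hat)
        sims.append(float(d @ q_hat / denom) if denom else 0.0)
    return sims


if __name__ == "__main__":
    docs = [
        ["cat", "sat", "mat"],
        ["cat", "sat", "sofa"],
        ["stock", "market", "fell"],
    ]
    print(lsa_similarities(docs, ["cat", "sat", "mat"], k=2))
```

Because the reference text is mapped into the same reduced space as the comparison texts, two texts can score as similar even when they share only some of their words, which is the kind of case plain word-overlap counting handles poorly.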

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and in particular relates to a method for acquiring text similarity, a terminal device and a computer-readable storage medium.

Background

[0002] Text similarity measures the degree of similarity between texts. Traditionally, text similarity was determined by human judgment. However, manually judging a large number of similar texts is time-consuming and cumbersome. To solve this problem, word frequency statistics and vector space models such as simhash have emerged as research has progressed. These vector space models calculate text similarity by identifying the words that two articles have in common, based on information such as the presence or absence of words and the frequency of each word. Therefore, only when there are a large number of identical words in two articles, the ca...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/289, G06K9/62
CPC: G06F40/205, G06F40/289
Inventor: 李育儒, 王鸿滨, 吴晓贝, 汪伟
Owner: PING AN TECH (SHENZHEN) CO LTD